100853 Commits

Author SHA1 Message Date
Mario Limonciello
3a9626c816 drm/amd: Stop evicting resources on APUs in suspend
commit 5095d5418193 ("drm/amd: Evict resources during PM ops prepare()
callback") intentionally moved the eviction of resources to earlier in
the suspend process, but this introduced a subtle change that it occurs
before adev->in_s0ix or adev->in_s3 are set. This meant that APUs
actually started to evict resources at suspend time as well.

Explicitly set s0ix or s3 in the prepare() stage, and unset them if the
prepare() stage failed.

v2: squash in warning fix from Stephen Rothwell

Reported-by: Jürg Billeter <j@bitron.ch>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3132#note_2271038
Fixes: 5095d5418193 ("drm/amd: Evict resources during PM ops prepare() callback")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:43 -05:00
Srinivasan Shanmugam
ccc514b7e7 drm/amd/display: Fix possible buffer overflow in 'find_dcfclk_for_voltage()'
when 'find_dcfclk_for_voltage()' function is looping over
VG_NUM_SOC_VOLTAGE_LEVELS (which is 8), but the size of the DcfClocks
array is VG_NUM_DCFCLK_DPM_LEVELS (which is 7).

When the loop variable i reaches 7, the function tries to access
clock_table->DcfClocks[7]. However, since the size of the DcfClocks
array is 7, the valid indices are 0 to 6. Index 7 is beyond the size of
the array, leading to a buffer overflow.

Reported by smatch & thus fixing the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/clk_mgr/dcn301/vg_clk_mgr.c:550 find_dcfclk_for_voltage() error: buffer overflow 'clock_table->DcfClocks' 7 <= 7

Fixes: 3a83e4e64bb1 ("drm/amd/display: Add dcn3.01 support to DC (v2)")
Cc: Roman Li <Roman.Li@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:43 -05:00
Srinivasan Shanmugam
88c6d84dd8 drm/amd/display: Fix possible use of uninitialized 'max_chunks_fbc_mode' in 'calculate_bandwidth()'
'max_chunks_fbc_mode' is only declared and assigned a value under a
specific condition in the following lines:

if (data->fbc_en[i] == 1) {
	max_chunks_fbc_mode = 128 - dmif_chunk_buff_margin;
}

If 'data->fbc_en[i]' is not equal to 1 for any i, max_chunks_fbc_mode
will not be initialized if it's used outside of this for loop.

Ensure that 'max_chunks_fbc_mode' is properly initialized before it's
used. Initialize it to a default value right after its declaration to
ensure that it gets a value assigned under all possible control flow
paths.

Thus fixing the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:914 calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.
drivers/gpu/drm/amd/amdgpu/../display/dc/basics/dce_calcs.c:917 calculate_bandwidth() error: uninitialized symbol 'max_chunks_fbc_mode'.

Fixes: 4562236b3bc0 ("drm/amd/dc: Add dc display driver (v2)")
Cc: Harry Wentland <harry.wentland@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:43 -05:00
Srinivasan Shanmugam
7edb5830ec drm/amd/display: Initialize 'wait_time_microsec' variable in link_dp_training_dpia.c
wait_time_microsec = max(wait_time_microsec, (uint32_t)
DPIA_CLK_SYNC_DELAY);

Above line is trying to assign the maximum value between
'wait_time_microsec' and 'DPIA_CLK_SYNC_DELAY' to wait_time_microsec.
However, 'wait_time_microsec' has not been assigned a value before this
line, initialize 'wait_time_microsec' at the point of declaration.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/link/protocols/link_dp_training_dpia.c:697 dpia_training_eq_non_transparent() error: uninitialized symbol 'wait_time_microsec'.

Fixes: 630168a97314 ("drm/amd/display: move dp link training logic to link_dp_training")
Cc: Wenjing Liu <wenjing.liu@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:42 -05:00
Dan Carpenter
17ba9cde11 drm/amd/display: Fix && vs || typos
These ANDs should be ORs or it will lead to a NULL dereference.

Fixes: fb5a3d037082 ("drm/amd/display: Add NULL test for 'timing generator' in 'dcn21_set_pipe()'")
Fixes: 886571d217d7 ("drm/amd/display: Fix 'panel_cntl' could be null in 'dcn21_set_backlight_level()'")
Reviewed-by: Anthony Koo <anthony.koo@amd.com>
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:42 -05:00
Kent Russell
a0c9956a8d drm/amdkfd: Fix L2 cache size reporting in GFX9.4.3
Its currently incorrectly multiplied by number of XCCs in the partition

Fixes: be457b2252b6 ("drm/amdkfd: Update cache info for GFX 9.4.3")
Signed-off-by: Kent Russell <kent.russell@amd.com>
Reviewed-by: Mukul Joshi <mukul.joshi@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:42 -05:00
Hamza Mahfooz
d16df040c8 drm/amdgpu: make damage clips support configurable
We have observed that there are quite a number of PSR-SU panels on the
market that are unable to keep up with what user space throws at them,
resulting in hangs and random black screens. So, make damage clips
support configurable and disable it by default for PSR-SU displays.

Cc: stable@vger.kernel.org
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-15 14:18:42 -05:00
Dave Airlie
311520887d Merge tag 'drm-msm-fixes-2024-02-07' of https://gitlab.freedesktop.org/drm/msm into drm-fixes
Fixes for v6.8-rc4

DPU:
- fix for kernel doc warnings and smatch warnings in dpu_encoder
- fix for smatch warning in dpu_encoder
- fix the bus bandwidth value for SDM670

DP:
- fixes to handle unknown bpc case correctly for DP. The current code was
  spilling over into other bits of DP configuration register, had to be
  fixed to avoid the extra shifts which were causing the spill over
- fix for MISC0 programming in DP driver to program the correct
  colorimetry value

GPU:
- dmabuf vmap fix
- a610 UBWC corruption fix (incorrect hbb)
- revert a commit that was making GPU recovery unreliable

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rob Clark <robdclark@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/CAF6AEGv+tb1+_cp7ftxcMZbbxE9810rvxeaC50eL=msQ+zkm0g@mail.gmail.com
2024-02-09 11:32:38 +10:00
Dave Airlie
b30bed9d00 amd-drm-fixes-6.8-2024-02-08:
amdgpu:
 - Misc NULL/bounds check fixes
 - ODM pipe policy fix
 - Aborted suspend fixes
 - JPEG 4.0.5 fix
 - DCN 3.5 fixes
 - PSP fix
 - DP MST fix
 - Phantom pipe fix
 - VRAM vendor fix
 - Clang fix
 - SR-IOV fix
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZcUGhAAKCRC93/aFa7yZ
 2Pp8AP4shwYR9OwKXx13nspSD/le4qvxHRrodeaBtSARWYXDywEA9wt+NZ3r/izk
 80R970NODmOawhixNRex08ggd2TTZwg=
 =25qe
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.8-2024-02-08' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.8-2024-02-08:

amdgpu:
- Misc NULL/bounds check fixes
- ODM pipe policy fix
- Aborted suspend fixes
- JPEG 4.0.5 fix
- DCN 3.5 fixes
- PSP fix
- DP MST fix
- Phantom pipe fix
- VRAM vendor fix
- Clang fix
- SR-IOV fix

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240208165500.4887-1-alexander.deucher@amd.com
2024-02-09 11:21:23 +10:00
Dave Airlie
9da93fe430 Merge tag 'drm-intel-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes
- Just includes gvt-fixes-2024-02-05

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcTETgXsejwVwat6@jlahtine-mobl.ger.corp.intel.com
2024-02-09 11:17:57 +10:00
Dave Airlie
60c16201b6 Driver Changes:
- Fix a loop in an error path
 - Fix a missing dma-fence reference
 - Fix a retry path on userptr REMAP
 - Workaround for a false gcc warning
 - Fix missing map of the usm batch buffer
   in the migrate vm.
 - Fix a memory leak.
 - Fix a bad assumption of used page size
 - Fix hitting a BUG() due to zero pages to map.
 - Remove some leftover async bind queue relics
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRskUM7w1oG5rx2IZO4FpNVCsYGvwUCZcS1UgAKCRC4FpNVCsYG
 v8wMAQCfZF54SM76+feBVRzwXxhjAhqe2q9v/hZMHJOsF8Be7AD+PKhCfitzxKgJ
 m+K9kGwyI2Kv0VEQjIVPkucKEHA0FQc=
 =YJd2
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-fixes-2024-02-08' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

Driver Changes:
- Fix a loop in an error path
- Fix a missing dma-fence reference
- Fix a retry path on userptr REMAP
- Workaround for a false gcc warning
- Fix missing map of the usm batch buffer
  in the migrate vm.
- Fix a memory leak.
- Fix a bad assumption of used page size
- Fix hitting a BUG() due to zero pages to map.
- Remove some leftover async bind queue relics

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcS2LllawGifubsk@fedora
2024-02-09 11:12:09 +10:00
Dave Airlie
6c2bf9ca24 A null pointer dereference fix for v3d, a TTM pool initialization fix,
several fixes for nouveau around register size, DMA buffer leaks and API
 consistency, a multiple fixes for ivpu around MMU setup, initialization
 and firmware interactions.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZcTCuAAKCRDj7w1vZxhR
 xTI/AP4wnENmQLfNjGGAZNoAPgbonxSCsy2lNLDPj2LKCckpLgEApf0b8UCpbWGd
 3obSP91CD/WzyT6v+txDhXmG7STU3g0=
 =R6LM
 -----END PGP SIGNATURE-----

Merge tag 'drm-misc-fixes-2024-02-08' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

A null pointer dereference fix for v3d, a TTM pool initialization fix,
several fixes for nouveau around register size, DMA buffer leaks and API
consistency, a multiple fixes for ivpu around MMU setup, initialization
and firmware interactions.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Maxime Ripard <mripard@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/4wsi2i6kgkqdu7nzp4g7hxasbswnrmc5cakgf5zzvnix53u7lr@4rmp7hwblow3
2024-02-09 11:11:21 +10:00
Matthew Brost
bf4c27b826 drm/xe: Remove TEST_VM_ASYNC_OPS_ERROR
TEST_VM_ASYNC_OPS_ERROR is broken and unused. Remove for now and will
pull back in a later time when it is used, fixed, and properly hidden
behind a Kconfig option. Also fixup the supported flags value.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240206045010.2981051-1-matthew.brost@intel.com
(cherry picked from commit d9890c028d66a9e1ee3cccaa081ab5aedcbfe431)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:51:19 +01:00
Matthew Auld
9e3fc1d65d drm/xe/vm: don't ignore error when in_kthread
If GUP fails and we are in_kthread, we can have pinned = 0 and ret = 0.
If that happens we call sg_alloc_append_table_from_pages() with n_pages
= 0, which is not well behaved and can trigger:

kernel BUG at include/linux/scatterlist.h:115!

depending on if the pages array happens to be zeroed or not. Even if we
don't hit that it crashes later when trying to dma_map the returned
table.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202171435.427630-2-matthew.auld@intel.com
(cherry picked from commit 8087199cd5951c1eba26003b3e4296dbb2110adf)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:51:04 +01:00
Matthew Brost
95c058c8ef drm/xe: Assume large page size if VMA not yet bound
The calculation to determine max page size of a VMA during a REMAP
operations assumes the VMA has been bound. This assumption is not true
if the VMA is from an eariler operation in an array of binds. If a VMA
has not been bound use the maximum page size which will ensure the
previous / next REMAP operations are not incorrectly skipped.

Fixes: 8f33b4f054fc ("drm/xe: Avoid doing rebinds")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240205231714.2956225-1-matthew.brost@intel.com
(cherry picked from commit 5ad6af5c91e9b942c44b657122270d935db3a813)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:50:50 +01:00
Xiaoming Wang
11572b3f68 drm/xe/display: Fix memleak in display initialization
intel_power_domains_init is called twice in xe_device_probe:

1) intel_power_domains_init()
   xe_display_init_nommio()
   xe_device_probe()

2) intel_power_domains_init()
   intel_display_driver_probe_noirq()
   xe_display_init_noirq()
   xe_device_probe()

It needs remove one to avoid power_domains->power_wells double malloc.

unreferenced object 0xffff88811150ee00 (size 512):
  comm "systemd-udevd", pid 506, jiffies 4294674198 (age 3605.560s)
  hex dump (first 32 bytes):
    10 b4 9d a0 ff ff ff ff ff ff ff ff ff ff ff ff  ................
    ff ff ff ff ff ff ff ff 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8134b901>] __kmem_cache_alloc_node+0x1c1/0x2b0
    [<ffffffff812c98b2>] __kmalloc+0x52/0x150
    [<ffffffffa08b0033>] __set_power_wells+0xc3/0x360 [xe]
    [<ffffffffa08562fc>] xe_display_init_nommio+0x4c/0x70 [xe]
    [<ffffffffa07f0d1c>] xe_device_probe+0x3c/0x5a0 [xe]
    [<ffffffffa082e48f>] xe_pci_probe+0x33f/0x5a0 [xe]
    [<ffffffff817f2187>] local_pci_probe+0x47/0xa0
    [<ffffffff817f3db3>] pci_device_probe+0xc3/0x1f0
    [<ffffffff8192f2a2>] really_probe+0x1a2/0x410
    [<ffffffff8192f598>] __driver_probe_device+0x78/0x160
    [<ffffffff8192f6ae>] driver_probe_device+0x1e/0x90
    [<ffffffff8192f92a>] __driver_attach+0xda/0x1d0
    [<ffffffff8192c95c>] bus_for_each_dev+0x7c/0xd0
    [<ffffffff8192e159>] bus_add_driver+0x119/0x220
    [<ffffffff81930d00>] driver_register+0x60/0x120
    [<ffffffffa05e50a0>] 0xffffffffa05e50a0

The call to intel_power_domains_cleanup() needs to stay where it is for
now. The main issue is that while the init is called by the display
side, shared by i915 and xe, the cleanup is called by a non-shared code
path. Fixing that will be done as a separate commit.

Fixes: 44e694958b95 ("drm/xe/display: Implement display support")
Signed-off-by: Xiaoming Wang <xiaoming.wang@intel.com>
[ reword commit message and explain why the fini needs to stay
  where it is ]
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202215658.561298-1-lucas.demarchi@intel.com
(cherry picked from commit 86c99abb5f1b6fcd69fb268eeb2e34cb7c4f355c)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:50:20 +01:00
Matthew Brost
3aa3c5c249 drm/xe: Map both mem.kernel_bb_pool and usm.bb_pool
For integrated devices we need to map both mem.kernel_bb_pool and
usm.bb_pool to be able to run batches from both pools.

Fixes: a682b6a42d4d ("drm/xe: Support device page faults on integrated platforms")
Tested-by: Brian Welty <brian.welty@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Brian Welty <brian.welty@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202033440.2351862-1-matthew.brost@intel.com
(cherry picked from commit 72f86ed3c88933d6fa09b036de93621ea71097a7)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:49:59 +01:00
Arnd Bergmann
90773aaf91 drm/xe: circumvent bogus stringop-overflow warning
gcc-13 warns about an array overflow that it sees but that is
prevented by the "asid % NUM_PF_QUEUE" calculation:

drivers/gpu/drm/xe/xe_gt_pagefault.c: In function 'xe_guc_pagefault_handler':
include/linux/fortify-string.h:57:33: error: writing 16 bytes into a region of size 0 [-Werror=stringop-overflow=]
include/linux/fortify-string.h:689:26: note: in expansion of macro '__fortify_memcpy_chk'
  689 | #define memcpy(p, q, s)  __fortify_memcpy_chk(p, q, s,                  \
      |                          ^~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/xe/xe_gt_pagefault.c:341:17: note: in expansion of macro 'memcpy'
  341 |                 memcpy(pf_queue->data + pf_queue->tail, msg, len * sizeof(u32));
      |                 ^~~~~~
drivers/gpu/drm/xe/xe_gt_types.h:102:25: note: at offset [1144, 265324] into destination object 'tile' of size 8

I found that rewriting the assignment using pointer addition rather than the
equivalent array index calculation prevents the warning, so use that instead.

I sent a bug report against gcc for the false positive warning.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113214
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240103114819.2913937-1-arnd@kernel.org
(cherry picked from commit 774ef5dfc95578a9079426d5106076dcd59c4dfa)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:49:41 +01:00
Matthew Brost
21abf108a0 drm/xe: Pick correct userptr VMA to repin on REMAP op failure
A REMAP op is composed of 3 VMA's - unmap, prev map, and next map. When
op_execute fails with -EAGAIN we need to update the local VMA pointer to
the current op state and then repin the VMA if it is a userptr.

Fixes a failure seen in xe_vm.munmap-style-unbind-userptr-one-partial.

Fixes: b06d47be7c83 ("drm/xe: Port Xe to GPUVA")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-3-matthew.brost@intel.com
(cherry picked from commit 447f74d223b4f6cbab74963bf1099050c15374ce)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:49:29 +01:00
Matthew Brost
fc29b6d5ab drm/xe: Take a reference in xe_exec_queue_last_fence_get()
Take a reference in xe_exec_queue_last_fence_get(). Also fix a reference
counting underflow bug VM bind and unbind.

Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240201004849.2219558-2-matthew.brost@intel.com
(cherry picked from commit a856b67a84169e065ebbeee50258936b1eacc9eb)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:49:22 +01:00
Matthew Brost
ddc7d4c584 drm/xe: Fix loop in vm_bind_ioctl_ops_unwind
The logic for the unwind loop is incorrect resulting in an infinite
loop. Fix to unwind to go from the last operations list to he first.

Fixes: 617eebb9c480 ("drm/xe: Fix array of binds")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240201175532.2303168-1-matthew.brost@intel.com
(cherry picked from commit 3acc1ff1a72fce00cdbd3ef1c27108a967fd5616)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2024-02-08 09:49:10 +01:00
Lijo Lazar
534c8a5b9d drm/amdgpu: Fix HDP flush for VFs on nbio v7.9
HDP flush remapping is not done for VFs. Keep the original offsets in VF
environment.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:30:11 -05:00
Srinivasan Shanmugam
58fca355ad drm/amd/display: Implement bounds check for stream encoder creation in DCN301
'stream_enc_regs' array is an array of dcn10_stream_enc_registers
structures. The array is initialized with four elements, corresponding
to the four calls to stream_enc_regs() in the array initializer. This
means that valid indices for this array are 0, 1, 2, and 3.

The error message 'stream_enc_regs' 4 <= 5 below, is indicating that
there is an attempt to access this array with an index of 5, which is
out of bounds. This could lead to undefined behavior

Here, eng_id is used as an index to access the stream_enc_regs array. If
eng_id is 5, this would result in an out-of-bounds access on the
stream_enc_regs array.

Thus fixing Buffer overflow error in dcn301_stream_encoder_create
reported by Smatch:
drivers/gpu/drm/amd/amdgpu/../display/dc/resource/dcn301/dcn301_resource.c:1011 dcn301_stream_encoder_create() error: buffer overflow 'stream_enc_regs' 4 <= 5

Fixes: 3a83e4e64bb1 ("drm/amd/display: Add dcn3.01 support to DC (v2)")
Cc: Roman Li <Roman.Li@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Roman Li <roman.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:30:04 -05:00
Nathan Chancellor
e63e35f016 drm/amd/display: Increase frame-larger-than for all display_mode_vba files
After a recent change in LLVM, allmodconfig (which has CONFIG_KCSAN=y
and CONFIG_WERROR=y enabled) has a few new instances of
-Wframe-larger-than for the mode support and system configuration
functions:

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20v2.c:3393:6: error: stack frame size (2144) exceeds limit (2048) in 'dml20v2_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3393 | void dml20v2_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn21/display_mode_vba_21.c:3520:6: error: stack frame size (2192) exceeds limit (2048) in 'dml21_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3520 | void dml21_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

  drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn20/display_mode_vba_20.c:3286:6: error: stack frame size (2128) exceeds limit (2048) in 'dml20_ModeSupportAndSystemConfigurationFull' [-Werror,-Wframe-larger-than]
   3286 | void dml20_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_lib)
        |      ^
  1 error generated.

Without the sanitizers enabled, there are no warnings.

This was the catalyst for commit 6740ec97bcdb ("drm/amd/display:
Increase frame warning limit with KASAN or KCSAN in dml2") and that same
change was made to dml in commit 5b750b22530f ("drm/amd/display:
Increase frame warning limit with KASAN or KCSAN in dml") but the
frame_warn_flag variable was not applied to all files. Do so now to
clear up the warnings and make all these files consistent.

Cc: stable@vger.kernel.org
Closes: https://github.com/ClangBuiltLinux/linux/issue/1990
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:29:09 -05:00
Mario Limonciello
da48914e1f drm/amd/display: Clear phantom stream count and plane count
When dc_state_destruct() was refactored the new phantom_stream_count
and phantom_plane_count members weren't cleared.

Fixes: 012a04b1d6af ("drm/amd/display: Refactor phantom resource allocation")
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:28:55 -05:00
Lijo Lazar
55173942a6 drm/amdgpu: Avoid fetching VRAM vendor info
The present way to fetch VRAM vendor information turns out to be not
reliable on GFX 9.4.3 dGPUs as well. Avoid using the data.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2024-02-07 18:28:31 -05:00
Rodrigo Siqueira
29c5da1a12 drm/amd/display: Disable ODM by default for DCN35
Just ensure that ODM optimization is disabled by default.

Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:27:49 -05:00
Alvin Lee
ca8179ba11 drm/amd/display: Update phantom pipe enable / disable sequence
Previously we would call apply_ctx_to_hw to enable and disable
phantom pipes. However, apply_ctx_to_hw can potentially update
non-phantom pipes as well which is undesired. Instead of calling
apply_ctx_to_hw as a whole, call the relevant helpers for each
phantom pipe when enabling / disabling which will avoid us modifying
hardware state for non-phantom pipes unknowingly.

The use case is for an FRL display where FRL_Update is requested
by the display. In this case link_state_valid flag is cleared in
a passive callback thread and should be handled in the next stream /
link update. However, due to the call to apply_ctx_to_hw for the
phantom pipes during a flip, the main pipes were modified outside
of the desired sequence (driver does not handle link_state_valid = 0
on flips).

Cc: stable@vger.kernel.org # 6.6+
Reviewed-by: Samson Tam <samson.tam@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alvin Lee <alvin.lee2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:27:09 -05:00
Fangzhi Zuo
e6a7df96fa drm/amd/display: Fix MST Null Ptr for RV
The change try to fix below error specific to RV platform:

BUG: kernel NULL pointer dereference, address: 0000000000000008
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP NOPTI
CPU: 4 PID: 917 Comm: sway Not tainted 6.3.9-arch1-1 #1 124dc55df4f5272ccb409f39ef4872fc2b3376a2
Hardware name: LENOVO 20NKS01Y00/20NKS01Y00, BIOS R12ET61W(1.31 ) 07/28/2022
RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8>
RSP: 0018:ffff960cc2df77d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8afb87e81280 RCX: 0000000000000224
RDX: ffff8afb9ee37c00 RSI: ffff8afb8da1a578 RDI: ffff8afb87e81280
RBP: ffff8afb83d67000 R08: 0000000000000001 R09: ffff8afb9652f850
R10: ffff960cc2df7908 R11: 0000000000000002 R12: 0000000000000000
R13: ffff8afb8d7688a0 R14: ffff8afb8da1a578 R15: 0000000000000224
FS:  00007f4dac35ce00(0000) GS:ffff8afe30b00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000010ddc6000 CR4: 00000000003506e0
Call Trace:
 <TASK>
 ? __die+0x23/0x70
 ? page_fault_oops+0x171/0x4e0
 ? plist_add+0xbe/0x100
 ? exc_page_fault+0x7c/0x180
 ? asm_exc_page_fault+0x26/0x30
 ? drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper 0e67723696438d8e02b741593dd50d80b44c2026]
 ? drm_dp_atomic_find_time_slots+0x28/0x260 [drm_display_helper 0e67723696438d8e02b741593dd50d80b44c2026]
 compute_mst_dsc_configs_for_link+0x2ff/0xa40 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054]
 ? fill_plane_buffer_attributes+0x419/0x510 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054]
 compute_mst_dsc_configs_for_state+0x1e1/0x250 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054]
 amdgpu_dm_atomic_check+0xecd/0x1190 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054]
 drm_atomic_check_only+0x5c5/0xa40
 drm_mode_atomic_ioctl+0x76e/0xbc0
 ? _copy_to_user+0x25/0x30
 ? drm_ioctl+0x296/0x4b0
 ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
 drm_ioctl_kernel+0xcd/0x170
 drm_ioctl+0x26d/0x4b0
 ? __pfx_drm_mode_atomic_ioctl+0x10/0x10
 amdgpu_drm_ioctl+0x4e/0x90 [amdgpu 62e600d2a75e9158e1cd0a243bdc8e6da040c054]
 __x64_sys_ioctl+0x94/0xd0
 do_syscall_64+0x60/0x90
 ? do_syscall_64+0x6c/0x90
 entry_SYSCALL_64_after_hwframe+0x72/0xdc
RIP: 0033:0x7f4dad17f76f
Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <89> c>
RSP: 002b:00007ffd9ae859f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 000055e255a55900 RCX: 00007f4dad17f76f
RDX: 00007ffd9ae85a90 RSI: 00000000c03864bc RDI: 000000000000000b
RBP: 00007ffd9ae85a90 R08: 0000000000000003 R09: 0000000000000003
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000c03864bc
R13: 000000000000000b R14: 000055e255a7fc60 R15: 000055e255a01eb0
 </TASK>
Modules linked in: rfcomm snd_seq_dummy snd_hrtimer snd_seq snd_seq_device ccm cmac algif_hash algif_skcipher af_alg joydev mousedev bnep >
 typec libphy k10temp ipmi_msghandler roles i2c_scmi acpi_cpufreq mac_hid nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_mas>
CR2: 0000000000000008
---[ end trace 0000000000000000 ]---
RIP: 0010:drm_dp_atomic_find_time_slots+0x5e/0x260 [drm_display_helper]
Code: 01 00 00 48 8b 85 60 05 00 00 48 63 80 88 00 00 00 3b 43 28 0f 8d 2e 01 00 00 48 8b 53 30 48 8d 04 80 48 8d 04 c2 48 8b 40 18 <48> 8>
RSP: 0018:ffff960cc2df77d8 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff8afb87e81280 RCX: 0000000000000224
RDX: ffff8afb9ee37c00 RSI: ffff8afb8da1a578 RDI: ffff8afb87e81280
RBP: ffff8afb83d67000 R08: 0000000000000001 R09: ffff8afb9652f850
R10: ffff960cc2df7908 R11: 0000000000000002 R12: 0000000000000000
R13: ffff8afb8d7688a0 R14: ffff8afb8da1a578 R15: 0000000000000224
FS:  00007f4dac35ce00(0000) GS:ffff8afe30b00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 000000010ddc6000 CR4: 00000000003506e0

With a second DP monitor connected, drm_atomic_state in dm atomic check
sequence does not include the connector state for the old/existing/first
DP monitor. In such case, dsc determination policy would hit a null ptr
when it tries to iterate the old/existing stream that does not have a
valid connector state attached to it. When that happens, dm atomic check
should call drm_atomic_get_connector_state for a new connector state.
Existing dm has already done that, except for RV due to it does not have
official support of dsc where .num_dsc is not defined in dcn10 resource
cap, that prevent from getting drm_atomic_get_connector_state called.
So, skip dsc determination policy for ASICs that don't have DSC support.

Cc: stable@vger.kernel.org # 6.1+
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2314
Reviewed-by: Wayne Lin <wayne.lin@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Fangzhi Zuo <jerry.zuo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:22:56 -05:00
Stanley.Yang
2dcf82a8e8 drm/amdgpu: Fix shared buff copy to user
ta if invoke node buffer
|-------- ta type ----------|
|--------  ta id  ----------|
|-------- cmd  id ----------|
|------ shared buf len -----|
|------ shared buffer ------|

ta if invoke node buffer is as above, copy shared buffer data to correct location

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:22:04 -05:00
Nicholas Kazlauskas
280df4996c drm/amd/display: Increase eval/entry delay for DCN35
[Why]
To match firmware measurements and avoid hanging when accessing HW
that's in idle.

[How]
Increase the delays to what we've measured.

Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:21:27 -05:00
Li Ma
897925dcc5 drm/amdgpu: remove asymmetrical irq disabling in jpeg 4.0.5 suspend
A supplement to commit: 615dd56ac5379f4239940be69139a33e79e59c67
There is an irq warning of jpeg during resume in s2idle process. No irq enabled in jpeg 4.0.5 resume.

Fixes: 615dd56ac537 ("drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend")
Signed-off-by: Li Ma <li.ma@amd.com>
Acked-By: Saleemkhan Jamadar <saleemkhan.jamadar@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:19:22 -05:00
Prike Liang
6ef82ac664 drm/amdgpu: reset gpu for s3 suspend abort case
In the s3 suspend abort case some type of gfx9 power
rail not turn off from FCH side and this will put the
GPU in an unknown power status, so let's reset the gpu
to a known good power state before reinitialize gpu
device.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:19:08 -05:00
Prike Liang
93bafa32a6 drm/amdgpu: skip to program GFXDEC registers for suspend abort
In the suspend abort cases, the gfx power rail doesn't turn off so
some GFXDEC registers/CSB can't reset to default value and at this
moment reinitialize GFXDEC/CSB will result in an unexpected error.
So let skip those program sequence for the suspend abort case.

Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:19:04 -05:00
Wenjing Liu
2103370afb drm/amd/display: set odm_combine_policy based on context in dcn32 resource
[why]
When populating dml pipes, odm combine policy should be assigned based
on the pipe topology of the context passed in. DML pipes could be
repopulated multiple times during single validate bandwidth attempt. We
need to make sure that whenever we repopulate the dml pipes it is always
aligned with the updated context. There is a case where DML pipes get
repopulated during FPO optimization after ODM combine policy is changed.
Since in the current code we reinitlaize ODM combine policy, even though
the current context has ODM combine enabled, we overwrite it despite the
pipes are already split. This causes DML to think that MPC combine is
used so we mistakenly enable MPC combine because we apply pipe split
with ODM combine policy reset. This issue doesn't impact non windowed
MPO with ODM case because the legacy policy has restricted use cases. We
don't encounter the case where both ODM and FPO optimizations are
enabled together. So we decide to leave it as is because it is about to
be replaced anyway.

Cc: stable@vger.kernel.org # 6.6+
Reviewed-by: Chaitanya Dhere <chaitanya.dhere@amd.com>
Reviewed-by: Alvin Lee <alvin.lee2@amd.com>
Acked-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Wenjing Liu <wenjing.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:16:46 -05:00
Srinivasan Shanmugam
66951d98d9 drm/amd/display: Add NULL test for 'timing generator' in 'dcn21_set_pipe()'
In "u32 otg_inst = pipe_ctx->stream_res.tg->inst;"
pipe_ctx->stream_res.tg could be NULL, it is relying on the caller to
ensure the tg is not NULL.

Fixes: 474ac4a875ca ("drm/amd/display: Implement some asic specific abm call backs.")
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:14:56 -05:00
Srinivasan Shanmugam
e96fddb329 drm/amd/display: Fix 'panel_cntl' could be null in 'dcn21_set_backlight_level()'
'panel_cntl' structure used to control the display panel could be null,
dereferencing it could lead to a null pointer access.

Fixes the below:
drivers/gpu/drm/amd/amdgpu/../display/dc/hwss/dcn21/dcn21_hwseq.c:269 dcn21_set_backlight_level() error: we previously assumed 'panel_cntl' could be null (see line 250)

Fixes: 474ac4a875ca ("drm/amd/display: Implement some asic specific abm call backs.")
Cc: Yongqiang Sun <yongqiang.sun@amd.com>
Cc: Anthony Koo <Anthony.Koo@amd.com>
Cc: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Anthony Koo <Anthony.Koo@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2024-02-07 18:14:39 -05:00
Matthew Brost
d0399da9fb drm/sched: Re-queue run job worker when drm_sched_entity_pop_job() returns NULL
Rather then loop over entities until one with a ready job is found,
re-queue the run job worker when drm_sched_entity_pop_job() returns NULL.

Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Fixes: 66dbd9004a55 ("drm/sched: Drain all entities in DRM sched run job worker")
Reviewed-by: Luben Tuikov <ltuikov89@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240130030413.2031009-1-matthew.brost@intel.com
2024-02-06 12:47:43 +10:00
Timur Tabi
34e659f34a drm/nouveau: nvkm_gsp_radix3_sg() should use nvkm_gsp_mem_ctor()
Function nvkm_gsp_radix3_sg() uses nvkm_gsp_mem objects to allocate the
radix3 tables, but it unnecessarily creates those objects manually
instead of using the standard nvkm_gsp_mem_ctor() function like the
rest of the code does.

Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-2-ttabi@nvidia.com
2024-02-05 18:41:09 +01:00
Timur Tabi
042b5f8384 drm/nouveau: fix several DMA buffer leaks
Nouveau manages GSP-RM DMA buffers with nvkm_gsp_mem objects.  Several of
these buffers are never dealloced.  Some of them can be deallocated
right after GSP-RM is initialized, but the rest need to stay until the
driver unloads.

Also futher bullet-proof these objects by poisoning the buffer and
clearing the nvkm_gsp_mem object when it is deallocated.  Poisoning
the buffer should trigger an error (or crash) from GSP-RM if it tries
to access the buffer after we've deallocated it, because we were wrong
about when it is safe to deallocate.

Finally, change the mem->size field to a size_t because that's the same
type that dma_alloc_coherent expects.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 176fdcbddfd2 ("drm/nouveau/gsp/r535: add support for booting GSP-RM")
Signed-off-by: Timur Tabi <ttabi@nvidia.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240202230608.1981026-1-ttabi@nvidia.com
2024-02-05 18:25:13 +01:00
Dave Airlie
61712c9478 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Cc: <stable@vger.kernel.org> # v6.7
Fixes: 8d55b0a940bb ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Danilo Krummrich <dakr@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240130032643.2498315-1-airlied@gmail.com
2024-02-05 17:36:48 +01:00
Joonas Lahtinen
a99682e839 Merge tag 'gvt-fixes-2024-02-05' of https://github.com/intel/gvt-linux into drm-intel-fixes
gvt-fixes-2024-02-05

- Fix broken gvt doc link (Zhenyu)
- Fix one uninitialized variable bug in warning (Dan)
- Update Zhi's new email address in MAINTAINERS file. (Zhi)

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
From: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZcBULqJAL2CWJoHh@debian-scheme
2024-02-05 15:56:47 +02:00
Maxime Ripard
4856380063 A null pointer dereference fix for v3d and a protection fault fix for
ttm.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZapnVgAKCRDj7w1vZxhR
 xStMAQDK8YH1S3cvNO6DEpge7tPc8NoUoDUd+O9ZefhL6+qT7gEAlNWcjSpxqTnX
 3NCLJbITAAvG58edL809JeW+JiBEUAA=
 =h3CM
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCZcDErgAKCRDj7w1vZxhR
 xcd2AP4yEGlt0fG5cMzww2Ct2NH7YAan/6o/WTkDTRGt3SoV9QEA6FSVQgXbOOkL
 hXh9qKMVar4rpe1D8fCy0RVpjvwF8gA=
 =WUjQ
 -----END PGP SIGNATURE-----

Merge drm-misc-next-fixes-2024-01-19 into drm-misc-fixes

Merge the last drm-misc-next-fixes tag that fell through the cracks.

Signed-off-by: Maxime Ripard <mripard@kernel.org>
2024-02-05 12:20:52 +01:00
Dan Carpenter
47caa96478 drm/i915/gvt: Fix uninitialized variable in handle_mmio()
This code prints the wrong variable in the warning message.  It should
print "i" instead of "info->offset".  On the first iteration "info" is
uninitialized leading to a crash and on subsequent iterations it prints
the previous offset instead of the current one.

Fixes: e0f74ed4634d ("i915/gvt: Separate the MMIO tracking table from GVT-g")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/11957c20-b178-4027-9b0a-e32e9591dd7c@moroto.mountain
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
2024-02-05 11:16:26 +08:00
Zhenyu Wang
1a00897e5e drm/i915: Replace dead 01.org link
01.org is dead so replace old gvt link with current wiki page.

Acked-by: Jani Nikula <jani.nikula@intel.com>
Reviewed-by: Zhi Wang <zhi.wang.linux@gmail.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20230804040544.1972958-1-zhenyuw@linux.intel.com
2024-02-05 11:15:26 +08:00
Linus Torvalds
9c2f0338bb drm fixes for 6.8-rc3
dma-buf:
 - heaps CMA page accounting fix
 
 virtio-gpu:
 - fix segment size
 
 xe:
 - A crash fix
 - A fix for an assert due to missing mem_acces ref
 - Only allow a single user-fence per exec / bind.
 - Some sparse warning fixes
 - Two fixes for compilation failures on various odd
   combinations of gcc / arch pointed out on LKML.
 - Fix a fragile partial allocation pointed out on LKML.
 - A sysfs ABI documentation warning fix
 
 amdgpu:
 - Fix reboot issue seen on some 7000 series dGPUs
 - Fix client init order for KFD
 - Misc display fixes
 - USB-C fix
 - DCN 3.5 fixes
 - Fix issues with GPU scheduler and GPU reset
 - GPU firmware loading fix
 - Misc fixes
 - GC 11.5 fix
 - VCN 4.0.5 fix
 - IH overflow fix
 
 amdkfd:
 - SVM fixes
 - Trap handler fix
 - Fix device permission lookup
 - Properly reserve BO before validating it
 
 nouveau:
 - fence/irq lock deadlock fix (second attempt)
 - gsp command size fix
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEEKbZHaGwW9KfbeusDHTzWXnEhr4FAmW9RgIACgkQDHTzWXnE
 hr4LnQ/9FFwGKMf8a0QIAWQFQVHLa4IMfjR8l3ppqXIZWlxG1LE/TddLRsS1xqA7
 QmliCFDuXHXoOBPTx5/vUBt7QuomJ2QnVOmcDbyym6Oae2XLQK8H8VRuGB33lGJo
 Qj7YWX5pml6jmWjG1j1b3M+Si0subiOhUCaB4AsIfUahUs8XeKbVXs3e11aaZyLK
 YgqFBKsPaT8x9foh0RC5X/i1QspVGUSUNtvEEi88t5ZrTfyWuL0HBWbtZYub4b9i
 URoTEdTWxx2zJs5Xe/1KZeyUnxtGukTeSuSyE53XFRPDIpkRCikafXvX7/PGuqbV
 x69K/iCHn6RssqYAOMXdDJtcVilJqPlfyH5KnoNz0XDmv2A8vRTmeZ7ULYy3yq6e
 kxGGeXyGHICp605P3FFsLgSiACkDAAW1lO2iLy0MOEMZ9U13rGG19ZCpOF8NC55Y
 Eok5AmrghD3mJqwzbrTBA8JsycC724auTwZ1J07dxS0DE1giB4BMXGe+5+yEZxp2
 pGMmzJQmegyKK9YOIOVTyj6wAXns+fMsFOVylVgl8mSPAecZiRA7kkORafsQV5ts
 I9jHd+wdXpVOh14GUf8fOayuf+YeJYj82qKCLgpy8d/WlvVUPTho8IXTSXjRU/5H
 rLA6rSLGu0LVW1AlsTh94X/H1mpX0IeYIjtE8+eWh0+Ugn5c/Wg=
 =9BzV
 -----END PGP SIGNATURE-----

Merge tag 'drm-fixes-2024-02-03' of git://anongit.freedesktop.org/drm/drm

Pul drm fixes from Dave Airlie:
 "Regular weekly fixes, mostly amdgpu and xe. One nouveau fix is a
  better fix for the deadlock and also helps with a sync race we were
  seeing.

  dma-buf:
   - heaps CMA page accounting fix

  virtio-gpu:
   - fix segment size

  xe:
   - A crash fix
   - A fix for an assert due to missing mem_acces ref
   - Only allow a single user-fence per exec / bind.
   - Some sparse warning fixes
   - Two fixes for compilation failures on various odd combinations of
     gcc / arch pointed out on LKML.
   - Fix a fragile partial allocation pointed out on LKML.
   - A sysfs ABI documentation warning fix

  amdgpu:
   - Fix reboot issue seen on some 7000 series dGPUs
   - Fix client init order for KFD
   - Misc display fixes
   - USB-C fix
   - DCN 3.5 fixes
   - Fix issues with GPU scheduler and GPU reset
   - GPU firmware loading fix
   - Misc fixes
   - GC 11.5 fix
   - VCN 4.0.5 fix
   - IH overflow fix

  amdkfd:
   - SVM fixes
   - Trap handler fix
   - Fix device permission lookup
   - Properly reserve BO before validating it

  nouveau:
   - fence/irq lock deadlock fix (second attempt)
   - gsp command size fix

* tag 'drm-fixes-2024-02-03' of git://anongit.freedesktop.org/drm/drm: (35 commits)
  nouveau: offload fence uevents work to workqueue
  nouveau/gsp: use correct size for registry rpc.
  drm/amdgpu/pm: Use inline function for IP version check
  drm/hwmon: Fix abi doc warnings
  drm/xe: Make all GuC ABI shift values unsigned
  drm/xe/vm: Subclass userptr vmas
  drm/xe: Use LRC prefix rather than CTX prefix in lrc desc defines
  drm/xe: Don't use __user error pointers
  drm/xe: Annotate mcr_[un]lock()
  drm/xe: Only allow 1 ufence per exec / bind IOCTL
  drm/xe: Grab mem_access when disabling C6 on skip_guc_pc platforms
  drm/xe: Fix crash in trace_dma_fence_init()
  drm/amdgpu: Reset IH OVERFLOW_CLEAR bit
  drm/amdgpu: remove asymmetrical irq disabling in vcn 4.0.5 suspend
  drm/amdgpu: drm/amdgpu: remove golden setting for gfx 11.5.0
  drm/amdkfd: reserve the BO before validating it
  drm/amdgpu: Fix missing error code in 'gmc_v6/7/8/9_0_hw_init()'
  drm/amd/display: Fix buffer overflow in 'get_host_router_total_dp_tunnel_bw()'
  drm/amd/display: Add NULL check for kzalloc in 'amdgpu_dm_atomic_commit_tail()'
  drm/amd: Don't init MEC2 firmware when it fails to load
  ...
2024-02-02 12:54:46 -08:00
Dave Airlie
39126abc5e nouveau: offload fence uevents work to workqueue
This should break the deadlock between the fctx lock and the irq lock.

This offloads the processing off the work from the irq into a workqueue.

Cc: linux-stable@vger.kernel.org
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/576237/
2024-02-02 17:15:47 +10:00
Dave Airlie
b5e69be185 nouveau/gsp: use correct size for registry rpc.
Timur pointed this out before, and it just slipped my mind,
but this might help some things work better, around pcie power
management.

Fixes: 8d55b0a940bb ("nouveau/gsp: add some basic registry entries.")
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/576336/
2024-02-02 17:15:16 +10:00
Dave Airlie
a639525686 amd-drm-fixes-6.8-2024-02-01:
amdgpu:
 - Fix reboot issue seen on some 7000 series dGPUs
 - Fix client init order for KFD
 - Misc display fixes
 - USB-C fix
 - DCN 3.5 fixes
 - Fix issues with GPU scheduler and GPU reset
 - GPU firmware loading fix
 - Misc fixes
 - GC 11.5 fix
 - VCN 4.0.5 fix
 - IH overflow fix
 
 amdkfd:
 - SVM fixes
 - Trap handler fix
 - Fix device permission lookup
 - Properly reserve BO before validating it
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZbvk/QAKCRC93/aFa7yZ
 2NXtAP96ISHnWQkQ+oGCU8aSrpBR2H1bLjoMtApSDKAHhmvq/QEAyWaCvtsbNB+J
 Iot1z0Xo2PkwfbUQHALQhg0W8TBueAw=
 =MqHq
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-fixes-6.8-2024-02-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes

amd-drm-fixes-6.8-2024-02-01:

amdgpu:
- Fix reboot issue seen on some 7000 series dGPUs
- Fix client init order for KFD
- Misc display fixes
- USB-C fix
- DCN 3.5 fixes
- Fix issues with GPU scheduler and GPU reset
- GPU firmware loading fix
- Misc fixes
- GC 11.5 fix
- VCN 4.0.5 fix
- IH overflow fix

amdkfd:
- SVM fixes
- Trap handler fix
- Fix device permission lookup
- Properly reserve BO before validating it

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240201184108.4923-1-alexander.deucher@amd.com
2024-02-02 15:30:21 +10:00
Dave Airlie
111a3f0afb UAPI Changes:
- Only allow a single user-fence per exec / bind.
   The reason for this clarification fix is a limitation in the implementation
   which can be lifted moving forward, if needed.
 
 Driver Changes:
 - A crash fix
 - A fix for an assert due to missing mem_acces ref
 - Only allow a single user-fence per exec / bind.
 - Some sparse warning fixes
 - Two fixes for compilation failures on various odd
   combinations of gcc / arch pointed out on LKML.
 - Fix a fragile partial allocation pointed out on LKML.
 
 Cross-driver Change:
 - A sysfs ABI documentation warning fix
   This also touches i915 and is acked by i915 maintainers.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRskUM7w1oG5rx2IZO4FpNVCsYGvwUCZbt/9gAKCRC4FpNVCsYG
 v5ioAP4wD+RIZ9iJ9DOG/UoQEieOje0JFjjdrwtPlUmt/kpBJQD/fvsArL0y/o3Z
 cWahxhlpkqunNSG8r7lhcLMlnrK16gw=
 =4Dmq
 -----END PGP SIGNATURE-----

Merge tag 'drm-xe-fixes-2024-02-01' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes

UAPI Changes:
- Only allow a single user-fence per exec / bind.
  The reason for this clarification fix is a limitation in the implementation
  which can be lifted moving forward, if needed.

Driver Changes:
- A crash fix
- A fix for an assert due to missing mem_acces ref
- Only allow a single user-fence per exec / bind.
- Some sparse warning fixes
- Two fixes for compilation failures on various odd
  combinations of gcc / arch pointed out on LKML.
- Fix a fragile partial allocation pointed out on LKML.

Cross-driver Change:
- A sysfs ABI documentation warning fix
  This also touches i915 and is acked by i915 maintainers.

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Thomas Hellstrom <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/ZbuCYdMDVK-kAWC5@fedora
2024-02-02 13:52:28 +10:00