1430 Commits

Author SHA1 Message Date
Ma Jun
88e5c8f874 drm/amd/pm: only check sriov vf flag once when creating hwmon sysfs
The current code checks sriov vf flag multiple times when creating
hwmon sysfs. So fix it.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-03 12:18:32 -04:00
Alex Deucher
23170863ea drm/amdgpu/smu13: drop compute workload workaround
This was fixed in PMFW before launch and is no longer
required.

Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-11-03 11:59:51 -04:00
Alex Deucher
49afe91370 drm/amd: Fix UBSAN array-index-out-of-bounds for Powerplay headers
For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2039926
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-03 11:59:51 -04:00
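The flexible-array change described in the commit above can be illustrated with a minimal userspace sketch. The struct and function names here are hypothetical and are not the real powerplay table definitions; the point is only that a trailing `entries[]` member, sized at allocation time, avoids indexing past a fixed one-element array.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical table layout, not a real pptable struct. */
struct demo_clock_table {
    uint8_t  count;      /* number of valid entries that follow */
    uint32_t entries[];  /* flexible array member instead of entries[1] */
};

/* Allocate a table with room for `count` entries; returns NULL on failure. */
static struct demo_clock_table *demo_table_alloc(uint8_t count)
{
    struct demo_clock_table *t =
        calloc(1, sizeof(*t) + (size_t)count * sizeof(t->entries[0]));

    if (t)
        t->count = count;
    return t;
}
```

With a one-element array declaration, an access such as `entries[3]` is what a bounds sanitizer would flag even when the backing memory is large enough.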
Ma Jun
42ef313754 drm/amd/pm: Return 0 as default min power limit for legacy asics
Return 0 as the default min power limit for the asics that use
powerplay.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-11-03 11:37:38 -04:00
Ma Jun
7f3e6b840f drm/amd/pm: Fix error of MACO flag setting code
MACO only works if BACO is supported

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org # 6.1.x
2023-10-31 17:13:13 -04:00
Kenneth Feng
5f38ac54e6 drm/amd/pm: fix the high voltage and temperature issue
Fix the high voltage and temperature issue after the driver is unloaded on smu 13.0.0,
smu 13.0.7 and smu 13.0.10.
v2 - fix the code format and make sure it is used in the unload case only.

Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-31 16:40:16 -04:00
Lijo Lazar
5575ce2132 drm/amd/pm: Fix warnings
Fixes warnings:

drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:286:45:
warning: '%s' directive output may be truncated writing up to 29 bytes
into a region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu13/smu_v13_0_6_ppt.c:286:52:
warning: '%s' directive output may be truncated writing up to 29 bytes
into a region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:72:45: warning:
'%s' directive output may be truncated writing up to 29 bytes into a
region of size 23 [-Wformat-truncation=]
drivers/gpu/drm/amd/amdgpu/../pm/swsmu/smu14/smu_v14_0.c:72:52: warning:
'%s' directive output may be truncated writing up to 29 bytes into a
region of size 23 [-Wformat-truncation=]

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-31 16:40:15 -04:00
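The -Wformat-truncation warnings quoted above are emitted when the compiler can see that a `%s` argument may be longer than the destination buffer. A minimal sketch of one general way to handle this, checking the `snprintf` return value, is shown below. The function name and the ".bin" suffix are assumptions for illustration; the sketch does not claim to reproduce the change made in this commit.

```c
#include <stdio.h>

/*
 * Build "<name>.bin" in `out`.
 * Returns 0 on success, -1 if the result would have been truncated.
 */
static int demo_fw_name(char *out, size_t out_size, const char *name)
{
    int n = snprintf(out, out_size, "%s.bin", name);

    /* snprintf returns the length it wanted to write, excluding the NUL. */
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

Enlarging the destination buffer so that the longest possible input fits is the other common remedy.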
Li Ma
f0b8f65b48 drm/amd/amdgpu: fix the GPU power print error in pm info
Modify the print format of the fractional part to avoid display error.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 19:04:45 -04:00
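A typical cause of a fractional-part display error is printing the fraction without zero padding. The sketch below assumes that kind of problem; the milliwatt unit, the function name and the exact format string are hypothetical and are not taken from the commit.

```c
#include <stdio.h>

/*
 * Format a power reading given in milliwatts as "<watts>.<hundredths> W".
 * The milliwatt unit and the function name are assumptions for this sketch.
 */
static int demo_format_power(char *buf, size_t size, unsigned int milliwatts)
{
    unsigned int whole = milliwatts / 1000;
    unsigned int hundredths = (milliwatts % 1000) / 10;

    /* %02u keeps the leading zero, so 5050 mW reads "5.05 W", not "5.5 W". */
    return snprintf(buf, size, "%u.%02u W", whole, hundredths);
}
```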
Lin.Cao
406e884535 drm/amd: check num of link levels when update pcie param
In an SR-IOV environment, the value of pcie_table->num_of_link_levels will
be 0, and num_of_levels - 1 will cause an array index out of bounds.

Signed-off-by: Lin.Cao <lincao12@amd.com>
Acked-by: Jingwen Chen <Jingwen.Chen2@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 19:04:19 -04:00
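The guard described in the commit above can be sketched in userspace C as follows. The struct, the array size and the function name are hypothetical stand-ins for the driver's PCIe table; the sketch only shows that a zero level count must be rejected before `count - 1` is used as an index.

```c
#include <stdint.h>

#define DEMO_MAX_LINK_LEVELS 4

/* Hypothetical stand-in for the driver's PCIe DPM table. */
struct demo_pcie_table {
    uint32_t num_of_link_levels;
    uint8_t  pcie_gen[DEMO_MAX_LINK_LEVELS];
};

/*
 * Return the PCIe gen of the highest link level, or -1 when the table is
 * empty or reports more levels than the array holds, so that
 * num_of_link_levels - 1 is never used as an index when the count is 0.
 */
static int demo_top_pcie_gen(const struct demo_pcie_table *t)
{
    if (t->num_of_link_levels == 0 ||
        t->num_of_link_levels > DEMO_MAX_LINK_LEVELS)
        return -1;
    return t->pcie_gen[t->num_of_link_levels - 1];
}
```

Because the count is unsigned, `0 - 1` wraps to a very large index rather than to -1, which is why the check has to come first.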
Yifan Zhang
5bd8e05fe2 drm/amd/pm: call smu_cmn_get_smc_version in is_mode1_reset_supported.
is_mode1_reset_supported may be called before smu init, when smu_context
is uninitialized in driver load/unload test. Call smu_cmn_get_smc_version
explicitly in is_mode1_reset_supported.

v2: apply to aldebaran in case is_mode1_reset_supported will be
uncommented (Candice Li)

Fixes: 710d9caec70c ("drm/amd/pm: drop most smu_cmn_get_smc_version in smu")
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 19:02:58 -04:00
Mario Limonciello
fbf1035b03 drm/amd: Disable PP_PCIE_DPM_MASK when dynamic speed switching not supported
Rather than individual ASICs checking for the quirk, set the quirk at the
driver level.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 18:41:23 -04:00
Ma Jun
d8da213478 drm/amd/pm: Fix the return value in default case
Fix the return value in default case and drop
redundant 'break'.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 18:41:21 -04:00
Jiadong Zhu
af0b7df70b drm/amd/pm: drop unneeded dpm features disablement for SMU 14.0.0
PMFW will handle the features disablement properly for the gpu reset case;
driver involvement may cause some unexpected issues.

Signed-off-by: Jiadong Zhu <Jiadong.Zhu@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 18:41:21 -04:00
Tao Zhou
4dd9f5404c drm/amd/pm: record mca debug mode in RAS
Call amdgpu_ras_set_mca_debug_mode when we set mca debug mode in smu
v13_0_6.

Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-26 18:41:21 -04:00
Stanley.Yang
ce43a5fa2e drm/amdgpu: Enable mca debug mode mode when ras enabled
Enable smu_v13_0_6 mca debug mode if ras is enabled.

Changed from V1:
	enable mca debug mode if ras enabled.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 15:11:27 -04:00
Bas Nieuwenhuizen
08e9ebc75b drm/amd/pm: Handle non-terminated overdrive commands.
The incoming strings might not be terminated by a newline
or a 0.

(found while testing a program that just wrote the string
 itself, causing a crash)

Cc: stable@vger.kernel.org
Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs")
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 15:11:26 -04:00
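The problem described in the commit above, input that ends without a newline or a NUL byte, can be sketched in userspace C. The function name and buffer sizes are hypothetical; this is an illustration of bounded copying with explicit termination, not the driver's parsing code.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy at most `count` bytes from `buf` (which may lack a trailing newline
 * or NUL) into `out`, always NUL-terminating and stripping one trailing
 * newline. Returns the resulting length, or -1 if the input does not fit.
 */
static int demo_copy_command(const char *buf, size_t count,
                             char *out, size_t out_size)
{
    if (out_size == 0 || count >= out_size)
        return -1;

    memcpy(out, buf, count);
    out[count] = '\0';

    /* An embedded NUL in the input is treated as the end of the command. */
    size_t len = strlen(out);
    if (len > 0 && out[len - 1] == '\n')
        out[--len] = '\0';
    return (int)len;
}
```

The key design choice is that the copy is driven by the byte count supplied by the caller, never by searching the source buffer for a terminator.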
Ma Jun
1958946858 drm/amd/pm: Support for getting power1_cap_min value
Support for getting power1_cap_min value on smu13 and smu11.
For other Asics, we still use 0 as the default value.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-20 15:11:26 -04:00
Kunwu.Chan
828f8e3137 drm/amd/pm: Fix a memory leak on an error path
Add missing free on an error path.

Fixes: 511a95552ec8 ("drm/amd/pm: Add SMU 13.0.6 support")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Kunwu.Chan <chentao@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 18:26:51 -04:00
Asad Kamal
53dd920c1f drm/amdgpu : Add hive ras recovery check
If one of the devices in the hive detects a
fatal error, a ras recovery reset message needs
to be sent to the PMFW of all devices in the hive.
For that, add a flag in the hive to indicate that
it is undergoing ras recovery.

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 18:26:51 -04:00
Alex Deucher
e40dd9c6b7 drm/amdgpu/pm: update SMU 13.0.0 PMFW version check
Update the PMFW version check for the ROCm optimizations.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-19 18:26:50 -04:00
Dave Airlie
27442758e9 amd-drm-next-6.7-2023-10-13:
amdgpu:
 - DC replay fixes
 - Misc code cleanups and spelling fixes
 - Documentation updates
 - RAS EEPROM Updates
 - FRU EEPROM Updates
 - IP discovery updates
 - SR-IOV fixes
 - RAS updates
 - DC PQ fixes
 - SMU 13.0.6 updates
 - GC 11.5 Support
 - NBIO 7.11 Support
 - GMC 11 Updates
 - Reset fixes
 - SMU 11.5 Updates
 - SMU 13.0 OD support
 - Use flexible arrays for bo list handling
 - W=1 Fixes
 - SubVP fixes
 - DPIA fixes
 - DCN 3.5 Support
 - Devcoredump fixes
 - VPE 6.1 support
 - VCN 4.0 Updates
 - S/G display fixes
 - DML fixes
 - DML2 Support
 - MST fixes
 - VRR fixes
 - Enable seamless boot in more cases
 - Enable content type property for HDMI
 - OLED fixes
 - Rework and clean up GPUVM TLB flushing
 - DC ODM fixes
 - DP 2.x fixes
 - AGP aperture fixes
 - SDMA firmware loading cleanups
 - Cyan Skillfish GPU clock counter fix
 - GC 11 GART fix
 - Cache GPU fault info for userspace queries
 - DC cursor check fixes
 - eDP fixes
 - DC FP handling fixes
 - Variable sized array fixes
 - SMU 13.0.x fixes
 - IB start and size alignment fixes for VCN
 - SMU 14 Support
 - Suspend and resume sequence rework
 - vkms fix
 
 amdkfd:
 - GC 11 fixes
 - GC 10 fixes
 - Doorbell fixes
 - CWSR fixes
 - SVM fixes
 - Clean up GC info enumeration
 - Rework memory limit handling
 - Coherent memory handling fixes
 - Use partial migrations in GPU faults
 - TLB flush fixes
 - DMA unmap fixes
 - GC 9.4.3 fixes
 - SQ interrupt fix
 - GTT mapping fix
 - GC 11.5 Support
 
 radeon:
 - Misc code cleanups
 - W=1 Fixes
 - Fix possible buffer overflow
 - Fix possible NULL pointer dereference
 
 UAPI:
 - Add EXT_COHERENT memory allocation flags.  These allow for system scope atomics.
   Proposed userspace: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/pull/88
 - Add support for new VPE engine.  This is a memory to memory copy engine with advanced scaling, CSC, and color management features
   Proposed mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/25713
 - Add INFO IOCTL interface to query GPU faults
   Proposed Mesa MR: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/23238
   Proposed libdrm MR: https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/298
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZSmDAQAKCRC93/aFa7yZ
 2EdeAQC2lkQ9IHLOon5kIZUK+r9IPYlgFsii+qfmMPLBaMcuwgEA8F4eJln/cc9V
 02EKhlapkggYXYa+uhOE2KTnWgMFJgI=
 =SEXq
 -----END PGP SIGNATURE-----

Merge tag 'amd-drm-next-6.7-2023-10-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20231013175758.1735031-1-alexander.deucher@amd.com
2023-10-18 16:08:07 +10:00
Li Ma
49c775b783 drm/amd/swsmu: update smu v14_0_0 header files and metrics table
Update driver if, pmfw and ppsmc header files.
Add new gpu_metrics_v3_0 for metrics table updated in driver if
and reserve legacy metrics table to maintain backward compatibility.
---
v1:
Update header files and add gpu_metrics_v3_0.
v2:
Update smu_types.h, smu headers and drop smu_cmn_get_smc_version in smu v14_0_0.

Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:36:17 -04:00
Lijo Lazar
f20f3b0d6c drm/amd/pm: Add P2S tables for SMU v13.0.6
Add P2S table load support on SMU v13.0.6 ASICs.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:33:01 -04:00
Yifan Zhang
710d9caec7 drm/amd/pm: drop most smu_cmn_get_smc_version in smu
smu_check_fw_version is called in smu hw init, thus smu if version
and version are guaranteed to be stored in smu context. No need to
call smu_cmn_get_smc_version again after system boot up.

Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:27:17 -04:00
Ma Jun
ce6eb957ff drm/amd/pm: Add reset option for fan_ctrl on smu 13.0.7
Add reset option for fan_ctrl interfaces on the smu v13.0.7.
User can use command "echo r > interface_name" to reset the
interface to boot value.

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:27:11 -04:00
Ma Jun
1007bc36ec drm/amd/pm: Add reset option for fan_ctrl interfaces
Add reset option for fan_ctrl interfaces.

For example:
User can use the "echo r > acoustic_limit_rpm_threshold" command
to reset acoustic_limit_rpm_threshold to boot value

Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:27:04 -04:00
Lang Yu
e4deccc1d1 drm/amdgpu: add support to power up/down UMSCH by SMU
Power up/down UMSCH by SMU.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Leo Liu <leo.liu@amd.com>
Acked-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:12 -04:00
Lang Yu
56d3de7da6 drm/amdgpu: add power up/down UMSCH ppt callback
Add ppt callback to power up/down UMSCH.

v2: squash in updates (Alex)

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:08 -04:00
Lang Yu
d60fbf2d25 drm/amdgpu: add support to powerup VPE by SMU
Powerup VPE by SMU.

Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:03 -04:00
Li Ma
ad3e54ab9e drm/amdgpu/discovery: add SMU 14 support
add smu 14 into the IP discovery list.

Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 11:00:00 -04:00
Kenneth Feng
fe6cd91524 drm/amd/swsmu: add smu14 ip support
Add initial swSMU support for smu 14 series ASIC.

v2: squash in build fixes and updates (Li Ma)
    fix warnings (Alex)
v3: squash in updates (Alex)
v4: squash in updates (Alex)
v5: squash in avg/current power updates (Alex)

Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Likun Gao <Likun.Gao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:55 -04:00
Li Ma
cd6d69dd9b drm/amd/swsmu: add smu v14_0_0 pmfw if file
Add initial smu v14_0_0 pmfw if file

v2: squash in updates (Alex)

Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:52 -04:00
Li Ma
7fc712f67e drm/amd/swsmu: add smu v14_0_0 ppsmc file
Add initial smu v14_0_0 ppsmc file

v2: squash in updates (Alex)
v3: squash in updates (Alex)

Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:48 -04:00
Li Ma
ee26087f91 drm/amdgpu/swsmu: add smu v14_0_0 driver if file
Add initial smu v14_0_0 driver if file

v2: squash in updates (Alex)
v3: update interface (Alex)

Signed-off-by: Li Ma <li.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:38 -04:00
Asad Kamal
915414d096 drm/amd/pm: Use gpu_metrics_v1_4 for SMUv13.0.6
Use gpu_metrics_v1_4 for SMUv13.0.6 to fill
gpu metric info

v3: Removed filling gpu metric instantaneous
pcie bw

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:10 -04:00
Asad Kamal
011d99ee71 drm/amd/pm: Add gpu_metrics_v1_4
Add new gpu_metrics_v1_4 to acquire XGMI data transfer,
pcie bandwidth & Clock lock status

v2:
Add pcie error counter to gpu metric table v1_4

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:59:02 -04:00
Asad Kamal
79b049064a drm/amd/pm: Update metric table for smu v13_0_6
Update pmfw metric table to include xgmi transfer
data and pci instantaneous bandwidth for smu v13_0_6

v2:
Updated metric table version

v3: Removed inst pcie bw with alignment to metrics table
version 8

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:58:54 -04:00
Tim Huang
de7f3c4ece drm/amd/pm: wait for completion of the EnableGfxImu command
Wait for completion of sending the EnableGfxImu message
when using the PSP FW loading.

Signed-off-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-13 10:58:41 -04:00
Thomas Zimmermann
57390019b6 Merge drm/drm-next into drm-misc-next
Updating drm-misc-next to the state of Linux v6.6-rc2.

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
2023-10-11 09:50:59 +02:00
Lijo Lazar
8a2b51392a drm/amdgpu: Refactor FRU product information
Keep FRU related information together in a separate structure.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:52:08 -04:00
Alex Deucher
f8cd72728b drm/amdgpu: Enable SMU 13.0.0 optimizations when ROCm is active (v2)
When ROCm is active enable additional SMU 13.0.0 optimizations.
This reuses the unused powersave profile on PMFW.

v2: move to the swsmu code since we need both bits active in
    the workload mask.

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:48:16 -04:00
Mario Limonciello
0f0e59075b drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
For pptable structs that use flexible array sizes, use flexible arrays.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:47:23 -04:00
Mario Limonciello
760efbca74 drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
For pptable structs that use flexible array sizes, use flexible arrays.

Suggested-by: Felix Held <felix.held@amd.com>
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-09 16:46:54 -04:00
Asad Kamal
c207c36544 drm/amd/pm: Remove set df cstate for SMUv13.0.6
Remove set df cstate as disallow df state is
not required for SMUv13.0.6

Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-05 17:59:35 -04:00
Kees Cook
a640e3c3a5 drm/amd/pm: Annotate struct smu10_voltage_dependency_table with __counted_by
Prepare for the coming implementation by GCC and Clang of the __counted_by
attribute. Flexible array members annotated with __counted_by can have
their accesses bounds-checked at run-time via CONFIG_UBSAN_BOUNDS
(for array indexing) and CONFIG_FORTIFY_SOURCE (for strcpy/memcpy-family
functions).

As found with Coccinelle[1], add __counted_by for struct smu10_voltage_dependency_table.

[1] https://github.com/kees/kernel-tools/blob/trunk/coccinelle/examples/counted_by.cocci

Cc: Evan Quan <evan.quan@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: "Pan, Xinhui" <Xinhui.Pan@amd.com>
Cc: David Airlie <airlied@gmail.com>
Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Xiaojian Du <Xiaojian.Du@amd.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Kevin Wang <kevin1.wang@amd.com>
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230922173216.3823169-1-keescook@chromium.org
2023-10-05 11:29:03 +02:00
Lijo Lazar
9cff0879ae drm/amd/pm: Add GC v9.4.3 thermal limits to hwmon
Publish max operating temperature of SOC and memory as temp*_emergency
nodes in hwmon. temp*_crit will show the throttle temperature limits.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 18:43:06 -04:00
Lijo Lazar
df7a280852 drm/amd/pm: Add throttle limit for SMU v13.0.6
CTF limit represents the max operating temperature and thermal limit
gives the limit at which throttling starts. Add support for both limits.
SOC and HBM may have different limit values. *_emergency_max gives the max
operating temperature and *_crit_max value represents throttle limit.

Signed-off-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 18:42:58 -04:00
Mario Limonciello
b8e6aec146 drm/amd: Drop all hand-built MIN and MAX macros in the amdgpu base driver
Several files declare MIN() or MAX() macros that ignore the types of the
values being compared.  Drop these macros and switch to min(), min_t(),
and max() from `linux/minmax.h`.

Suggested-by: Hamza Mahfooz <Hamza.Mahfooz@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 18:39:52 -04:00
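The hazard of a hand-built macro that ignores operand types, as described in the commit above, can be shown with a small userspace sketch. `DEMO_MIN` and `demo_min_int` are hypothetical names; the kernel's min() and min_t() helpers add type checking that this sketch only imitates with a plain typed function.

```c
/* Hand-built macro: compares after the usual arithmetic conversions. */
#define DEMO_MIN(a, b) ((a) < (b) ? (a) : (b))

/* Typed helper in the spirit of min_t(): both operands are int. */
static int demo_min_int(int a, int b)
{
    return a < b ? a : b;
}

/* Returns 1 to show that the macro picks the "wrong" operand. */
static int demo_macro_picks_unsigned(void)
{
    int limit = -1;
    unsigned int value = 5;

    /* -1 is converted to UINT_MAX for the comparison, so the macro yields 5. */
    return DEMO_MIN(limit, value) == 5u;
}
```

Calling `demo_min_int(-1, 5)` returns -1, which is the result most readers would expect from the macro as well.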
Mario Limonciello
7752ccf85b drm/amd: Update update_pcie_parameters functions to use uint8_t arguments
The matching values for `pcie_gen_cap` and `pcie_width_cap` when
fetched from powerplay tables are 1 byte, so narrow the arguments
to match to ensure min() and max() comparisons without casts.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-04 18:39:39 -04:00
Mario Limonciello
ade134ddae drm/amd: Fix logic error in sienna_cichlid_update_pcie_parameters()
While aligning SMU11 with SMU13 implementation an assumption was made that
`dpm_context->dpm_tables.pcie_table` was populated in dpm table initialization
like in SMU13 but it isn't.

So restore some of the original logic and instead just check for
amdgpu_device_pcie_dynamic_switching_supported() to decide whether to hardcode
values; erring on the side of performance.

Cc: stable@vger.kernel.org # 6.1+
Reported-and-tested-by: Umio Yasuno <coelacanth_dream@protonmail.com>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/1447#note_2101382
Fixes: e701156ccc6c ("drm/amd: Align SMU11 SMU_MSG_OverridePcieParameters implementation with SMU13")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2023-10-03 15:43:05 -04:00