IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
XGMI TA will set the capability flag to indicate whether the port_num
info is supported or not. KGD checks the flag and accordingly picks up
the right buffer format and send the right command to TA to retrieve
the info.
v2: simplify the code by reusing the same statement (lijo)
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Acked-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Per the xgmi ta implementation, KGD needs to fill in node_ids
in concern into the shared command output buffer rather than the
command input buffer.
Input buffer is not used for GET_PEER_LINKS command execution.
In this way, xgmi ta can reuse the node info in the output buffer
just filled in and populate the same buffer with link info directly.
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
PMFW will be responsible for them.
v2: remove query interfaces.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the header file to the v20.00.00.13
v1: rename TA_COMMAND_XGMI__GET_GET_TOPOLOGY_INFO to
TA_COMMAND_XGMI__GET_TOPOLOGY_INFO
And also rename struct ta_xgmi_cmd_get_peer_link_info_output to
ta_xgmi_cmd_get_peer_link_info accordingly
v2: add structs to support xgmi GET_EXTEND_PEER_LINK command
Signed-off-by: Shiwu Zhang <shiwu.zhang@amd.com>
Reviewed-by: Le Ma <le.ma@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Record the debug mode status in RAS.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Simplify the code.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add clockgating support for NBIO ip 7.7.1
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Tim Huang <Tim.Huang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
add get_clockgating_state, update_medium_grain_light_sleep and
update_medium_grain_clock_gating in nbio_v7_11_funcs
v1:
add missing funcs in nbio_v7_11.c
v2:
modify the if condition and add spport for nbio v7.11 clockgating.
Signed-off-by: Li Ma <li.ma@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Enable RAS feature by default for aqua vanjaram on apu
platform.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
typo fix.
Fixes: 5b1270beb380 ("drm/amdgpu: add ras_err_info to identify RAS error source")
Signed-off-by: Yang Wang <kevinyang.wang@amd.com>
Reviewed-by: Candice Li <candice.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Update VCN header for RB decouple feature
- Add metadata struct, metadata will be placed after each RB
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Add function to check if RB decouple is enabled under SRIOV
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
- Update SRIOV header with RB decouple flag
Signed-off-by: Bokun Zhang <bokun.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Fix delete nodes that it has been freed.
Fixes: 5b1270beb380 ("drm/amdgpu: add ras_err_info to identify RAS error source")
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
display_mode_code.c is unconditionally built with
-Wframe-larger-than=2048, which causes warnings even when
CONFIG_FRAME_WARN has been set to 0, which should show no warnings.
Use the existing $(frame_warn_flag) variable, which handles this
situation. This is basically commit 25f178bbd078 ("drm/amd/display:
Respect CONFIG_FRAME_WARN=0 in dml Makefile") but for DML2.
Fixes: 7966f319c66d ("drm/amd/display: Introduce DML2")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
./drivers/gpu/drm/amd/display/dc/dml2/display_mode_core.c:4802:84-89: WARNING: conversion to bool not needed here
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6901
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
./drivers/gpu/drm/amd/display/dc/hwss/dce110/dce110_hwseq.c: dce110_hwseq.h is included more than once.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=6897
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn35/dcn35_fpu.c:261 dcn35_update_bw_bounding_box_fpu() warn: inconsistent indenting
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The incoming strings might not be terminated by a newline
or a 0.
(found while testing a program that just wrote the string
itself, causing a crash)
Cc: stable@vger.kernel.org
Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs")
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Set VCN/JPEG RAS masks to enable software RAS for
VCN and JPEG.
Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Make the code architecture more simple.
v2: reuse ras_reset_error_count in ras_reset_error_status.
Signed-off-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Function svm_range_split_by_grinity is not used,
so it is removed.
Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com>
Suggested-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support for getting power1_cap_min value on smu13 and smu11.
For other Asics, we still use 0 as the default value.
Signed-off-by: Ma Jun <Jun.Ma2@amd.com>
Reviewed-by: Kenneth Feng <kenneth.feng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Support replay mode where UE could be converted to CE.
Signed-off-by: Candice Li <candice.li@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_dma_buf_move_notify reserve fences for the page table updates
in amdgpu_vm_clear_freed and amdgpu_vm_handle_moved. This fixes a BUG_ON
in dma_resv_add_fence when using SDMA for page table updates.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
abo->tbo.resource may be NULL in amdgpu_vm_bo_update.
Fixes: 180253782038 ("drm/ttm: stop allocating dummy resources during BO creation")
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This is workaround, kiq ring test failed in suspend stage when do ras
recovery.
Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Tao Zhou <tao.zhou1@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
There are a few spelling mistakes and an minor grammatical issue in
some dml_print messages. Fix these.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Many of the register macros defined ind dcn32_resource.h have
extra brackets. This is not conforming to the style of those
defined in other DC header files.
[How]
Remove these brackets in dcn32_resource.h
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If the IMU version wasn't discovered from the header, such as when
the firmware was directly loaded by PSP then there is no firmware
version to show to userspace from sysfs or IOCTL.
The IMU F/W stores the version in the first scratch register though,
so fetch it in these cases to let the driver export.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
When the IMU ucode is loaded by the PSP parsing the version that comes from
Linux will vary. Rather than showing the wrong data to kernel interface
consumers, avoid populating it in this case.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The intention for early init is to find any missing microcode early
and fail the driver load if it's missing. Move this step to earlier
in driver init to match other IP blocks.
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Increase retry time for PSP BL wait, to compensate
for longer time to set c2pmsg 35 ready bit during
mode1 with RAS
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
prepare_suspend() is intended to be used for any IP blocks
that must allocate memory during the suspend sequence.
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Closes: https://lore.kernel.org/all/20231017143555.6a6450fc@canb.auug.org.au/
Fixes: cb11ca3233aa ("drm/amd: Add concept of running prepare_suspend() sequence for IP blocks")
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing free on an error path.
Fixes: 511a95552ec8 ("drm/amd/pm: Add SMU 13.0.6 support")
Reviewed-by: Yang Wang <kevinyang.wang@amd.com>
Signed-off-by: Kunwu.Chan <chentao@kylinos.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
It's not enabled in hardware so the code is dead.
Remove it.
Reviewed-by: Jack Xiao <Jack.Xiao@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
If one of the devices in the hive detects a
fatal error, need to send ras recovery reset
message to PMFW of all devices in the hive.
For that add a flag in hive to indicate that
it's undergoing ras recovery
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why & How]
A critial part of "drm/amd/display: Fix windowed MPO video with ODM
combine for DCN32" is lost during promotion to upstream. This patch
addes the code back to dc.c.
Signed-off-by: Stylon Wang <stylon.wang@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This reverts commit 0bdebfef3fb2b6291000765eaa9c6c8030293fce.
XCP_CTL register is programmed by firmware and
register access is protected.
Signed-off-by: Mangesh Gadre <Mangesh.Gadre@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Reviewed-by: Asad Kamal <asad.kamal@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Add missing IP callbacks.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Veerabadhran Gopalakrishnan <Veerabadhran.Gopalakrishnan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Update the PMFW version check the the ROCm optimizations.
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The notable changes are:
- uAPI changes:
- Expose tsc clock sampling to better sync clock information in profiler.
- Enhance engine error reporting in the info ioctl.
- Block access to the eventfd operations through the control device.
- Disable the option of the user to register multiple times with the same
offset for timestamp dump by the driver. If a user wants to use the same
offset in the timestamp buffer for different interrupt, it needs to first
de-register the offset.
- When exporting dma-buf (for p2p), force the user to specify size/offset
in multiples of PAGE_SIZE. This is instead of the driver doing the
rounding to PAGE_SIZE, which has caused the driver to map more memory
than was intended by the user.
- New features and improvements:
- Complete the move of the driver to the accel subsystem by removing the
custom habanalabs class and major and registering to accel subsystem.
- Move the firmware interface files to include/linux/habanalabs. This is
a pre-requisite for upstreaming the NIC drivers of Gaudi (as they need to
include those files).
- Perform device hard-reset upon PCIe AXI drain event to prevent the failure
from cascading to different IP blocks in the SoC. In secured environments,
this is done automatically by the firmware.
- Print device name when it is removed for better debuggability.
- Add support for trace of dma map sgtable operations.
- Optimize handling of user interrupts by splitting the interrupts to two
lists. One list for fast handling and second list for handling with
timestamp recording, which is slower.
- Prevent double device hard-reset due to 2 adjacent H/W events.
- Set device status 'malfunction' while in rmmod.
- Firmware related fixes:
- Extend preboot timeout because preboot loading might take longer than
expected in certain cases.
- Add a protection mechanism for the Event Queue. In case it is full, the
firmware will be able to notify about it through a dedicated interrupt.
- Perform device hard-reset in case scrubbing of memory has failed.
- Bug fixes and code cleanups:
- Small fixes of dma-buf handling in Gaudi2, such as handling an offset != 0,
using the correct exported size, creation of sg table.
- Fix spmu mask creation.
- Fix bug in wait for cs completion for decoder workloads.
- Cleanup Greco name from documentation.
- Fix bug in recording timestamp during cs completion interrupt handling.
- Fix CoreSight ETF configuration and flush logic.
- Fix small bug in hpriv_list handling (the list that contains the private
data per process that opens our device).
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEE7TEboABC71LctBLFZR1NuKta54AFAmUlHoQACgkQZR1NuKta
54DsXQf8CW+W4iWJf5UDTj/E/giu9rVRrsUsU0hhCcXbecIxRsLObYXtulENu5/u
VuEAo/tAvo0LUKi8pdIv6ernDKaxZ1+fimlfXMCzllAA/ts3yp1NgunprsIsx3tv
YgcJ2GNR8UlVZ1qYuZl+4dOTyD0yfRMROUXBe7wqKnUXOEepOiLBxq6W15tZiJnx
L+V0yGkNk6pAoADIXLW9EgEXiN/bJZCXGPWp06i/Nz7cHIHJGoV59wAqftqllCtk
8ZMkLByjlQKPhc5AgWBtKE8EGVip3sm7b/Q2Gq0ZXdZiebyVJ+AjuuDOdtq1UCIw
Rcp2576E7rByIBu3RAFlrioWhuR5Zw==
=2ien
-----END PGP SIGNATURE-----
Merge tag 'drm-habanalabs-next-2023-10-10' of https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux into drm-next
This tag contains habanalabs driver changes for v6.7.
The notable changes are:
- uAPI changes:
- Expose tsc clock sampling to better sync clock information in profiler.
- Enhance engine error reporting in the info ioctl.
- Block access to the eventfd operations through the control device.
- Disable the option of the user to register multiple times with the same
offset for timestamp dump by the driver. If a user wants to use the same
offset in the timestamp buffer for different interrupt, it needs to first
de-register the offset.
- When exporting dma-buf (for p2p), force the user to specify size/offset
in multiples of PAGE_SIZE. This is instead of the driver doing the
rounding to PAGE_SIZE, which has caused the driver to map more memory
than was intended by the user.
- New features and improvements:
- Complete the move of the driver to the accel subsystem by removing the
custom habanalabs class and major and registering to accel subsystem.
- Move the firmware interface files to include/linux/habanalabs. This is
a pre-requisite for upstreaming the NIC drivers of Gaudi (as they need to
include those files).
- Perform device hard-reset upon PCIe AXI drain event to prevent the failure
from cascading to different IP blocks in the SoC. In secured environments,
this is done automatically by the firmware.
- Print device name when it is removed for better debuggability.
- Add support for trace of dma map sgtable operations.
- Optimize handling of user interrupts by splitting the interrupts to two
lists. One list for fast handling and second list for handling with
timestamp recording, which is slower.
- Prevent double device hard-reset due to 2 adjacent H/W events.
- Set device status 'malfunction' while in rmmod.
- Firmware related fixes:
- Extend preboot timeout because preboot loading might take longer than
expected in certain cases.
- Add a protection mechanism for the Event Queue. In case it is full, the
firmware will be able to notify about it through a dedicated interrupt.
- Perform device hard-reset in case scrubbing of memory has failed.
- Bug fixes and code cleanups:
- Small fixes of dma-buf handling in Gaudi2, such as handling an offset != 0,
using the correct exported size, creation of sg table.
- Fix spmu mask creation.
- Fix bug in wait for cs completion for decoder workloads.
- Cleanup Greco name from documentation.
- Fix bug in recording timestamp during cs completion interrupt handling.
- Fix CoreSight ETF configuration and flush logic.
- Fix small bug in hpriv_list handling (the list that contains the private
data per process that opens our device).
Signed-off-by: Dave Airlie <airlied@redhat.com>
# -----BEGIN PGP SIGNATURE-----
#
# iQEzBAABCgAdFiEE7TEboABC71LctBLFZR1NuKta54AFAmUlHoQACgkQZR1NuKta
# 54DsXQf8CW+W4iWJf5UDTj/E/giu9rVRrsUsU0hhCcXbecIxRsLObYXtulENu5/u
# VuEAo/tAvo0LUKi8pdIv6ernDKaxZ1+fimlfXMCzllAA/ts3yp1NgunprsIsx3tv
# YgcJ2GNR8UlVZ1qYuZl+4dOTyD0yfRMROUXBe7wqKnUXOEepOiLBxq6W15tZiJnx
# L+V0yGkNk6pAoADIXLW9EgEXiN/bJZCXGPWp06i/Nz7cHIHJGoV59wAqftqllCtk
# 8ZMkLByjlQKPhc5AgWBtKE8EGVip3sm7b/Q2Gq0ZXdZiebyVJ+AjuuDOdtq1UCIw
# Rcp2576E7rByIBu3RAFlrioWhuR5Zw==
# =2ien
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 10 Oct 2023 19:51:00 AEST
# gpg: using RSA key ED311BA00042EF52DCB412C5651D4DB8AB5AE780
# gpg: Can't check signature: No public key
From: Oded Gabbay <ogabbay@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/ZSUfiX4J7v4Wn0cU@ogabbay-vm-u22.habana-labs.com