linux

iv/linux

Author	SHA1	Message	Date
Surbhi Kakarya	9256e8d47a	drm/amd: Disable XNACK on SRIOV environment The purpose of this patch is to disable XNACK or set XNACK OFF mode on SRIOV platform which doesn't support it. This will prevent user-space application to fail or result into unexpected behaviour whenever the application need to run test-case in XNACK ON mode. Signed-off-by: Surbhi Kakarya <surbhi.kakarya@amd.com> Reviewed-by: Shaoyun Liu <shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-11-07 11:15:37 -05:00
Bokun Zhang	fc3136730b	drm/amd/amdgpu/vcn: Add RB decouple feature under SRIOV - P2 - Add function to check if RB decouple is enabled under SRIOV Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-10-20 15:11:27 -04:00
Horace Chen	83f24a8f05	drm/amdgpu: set sw state to gfxoff after SR-IOV reset [Why] Current SR-IOV will not set GC to off state, while it is a real GC hard reset. Whthout GFX off flag, driver may do gfxhub invalidation before firmware load and gfxhub gart enable. This operation may cause CP to become busy because GC is not in the right state for invalidation. [How] Add a function for SR-IOV to clean up some sw state before recover. Set adev->gfx.is_poweron to false to prevent gfxhub invalidation before gfx firmware autoload complete. Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: HaiJun Chang <HaiJun.Chang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-25 13:35:23 -04:00
Victor Lu	8ed49dd1d3	drm/amdgpu: Add RLCG interface driver implementation for gfx v9.4.3 (v3) Add RLCG interface support for gfx v9.4.3 and multiple XCCs. Do not enable it yet. v2: Fix amdgpu_rlcg_reg_access_ctrl init, add support for multiple XCCs in amdgpu_mm_wreg_mmio_rlc v3: Use GET_INST() when indexing amdgpu_rlcg_reg_access_ctrl Signed-off-by: Victor Lu <victorchengchi.lu@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-07-18 11:16:41 -04:00
Jane Jian	d71e38df3b	drm/amdgpu/vcn: custom video info caps for sriov for sriov, we added a new flag to indicate av1 support, this will override the original caps info. Signed-off-by: Jane Jian <Jane.Jian@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2023-03-14 10:37:09 -04:00
Tao Zhou	8ede944da6	drm/amdgpu: add RAS poison consumption handler for AI SRIOV Send message to host and host will handle it. v2: split the patch into two parts, one is for mxgpu ai and another one is for common poison consumption handler. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-12-15 12:18:19 -05:00
Nathan Chancellor	f0d0f10873	drm/amdgpu: Fix type of second parameter in trans_msg() callback With clang's kernel control flow integrity (kCFI, CONFIG_CFI_CLANG), indirect call targets are validated against the expected function pointer prototype to make sure the call target is valid to help mitigate ROP attacks. If they are not identical, there is a failure at run time, which manifests as either a kernel panic or thread getting killed. A proposed warning in clang aims to catch these at compile time, which reveals: drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c:412:15: error: incompatible function pointer types initializing 'void ()(struct amdgpu_device , u32, u32, u32, u32)' (aka 'void ()(struct amdgpu_device , unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device , enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device , enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] .trans_msg = xgpu_ai_mailbox_trans_msg, ^~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. drivers/gpu/drm/amd/amdgpu/mxgpu_nv.c:435:15: error: incompatible function pointer types initializing 'void ()(struct amdgpu_device , u32, u32, u32, u32)' (aka 'void ()(struct amdgpu_device , unsigned int, unsigned int, unsigned int, unsigned int)') with an expression of type 'void (struct amdgpu_device , enum idh_request, u32, u32, u32)' (aka 'void (struct amdgpu_device , enum idh_request, unsigned int, unsigned int, unsigned int)') [-Werror,-Wincompatible-function-pointer-types-strict] .trans_msg = xgpu_nv_mailbox_trans_msg, ^~~~~~~~~~~~~~~~~~~~~~~~~ 1 error generated. The type of the second parameter in the prototype should be 'enum idh_request' instead of 'u32'. Update it to clear up the warnings. Link: https://github.com/ClangBuiltLinux/linux/issues/1750 Reported-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Kees Cook <keescook@chromium.org> Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-11-04 16:05:53 -04:00
Danijel Slivka	a7310d8de3	drm/amdgpu: set vm_update_mode=0 as default for Sienna Cichlid in SRIOV case For asic with VF MMIO access protection avoid using CPU for VM table updates. CPU pagetable updates have issues with HDP flush as VF MMIO access protection blocks write to mmBIF_BX_DEV0_EPF0_VF0_HDP_MEM_COHERENCY_FLUSH_CNTL register during sriov runtime. v3: introduce virtualization capability flag AMDGPU_VF_MMIO_ACCESS_PROTECT which indicates that VF MMIO write access is not allowed in sriov runtime Signed-off-by: Danijel Slivka <danijel.slivka@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-10-17 17:41:20 -04:00
Horace Chen	f8bd73213a	drm/amdgpu: Support PSP 13.0.10 on SR-IOV Add support for PSP 13.0.10 for SR-IOV VF Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:26 -04:00
Horace Chen	d9d86d085f	drm/amdgpu: refine virtualization psp fw skip check SR-IOV may need to load different firmwares for different ASIC inside VF. So create a new function in amdgpu_virt to check whether FW load needs to be skipped. Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Horace Chen <horace.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-09-01 15:11:12 -04:00
Victor Skvortsov	aa79d3808e	drm/amdgpu: Fix wait for RLCG command completion if (!(tmp & flag)) condition will always evaluate to true when the flag is 0x0 (AMDGPU_RLCG_GC_WRITE). Instead check that address bits are cleared to determine whether the command is complete. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Tested-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed by: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-02-16 17:30:02 -05:00
Victor Zhao	039cacd239	drm/amdgpu: add determine passthrough under arm64 add determine for passthrough mode under arm64 by reading CurrentEL register v2: squash in warning fix (Alex) Signed-off-by: Victor Zhao <Victor.Zhao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-27 15:47:34 -05:00
Hawking Zhang	381519dff8	drm/amdgpu: retire rlc callbacks sriov_rreg/wreg Not needed anymore. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	5d447e2967	drm/amdgpu: add helper for rlcg indirect reg access The helper will be used to access registers from sriov guest in full access time Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Hawking Zhang	29dbcac82f	drm/amdgpu: add helper to query rlcg reg access flag Query rlc indirect register access approach specified by sriov host driver per ip blocks Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Zhou, Peng Ju <PengJu.Zhou@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2022-01-25 18:00:33 -05:00
Victor Skvortsov	892deb4826	drm/amdgpu: Separate vf2pf work item init from virt data exchange We want to be able to call virt data exchange conditionally after gmc sw init to reserve bad pages as early as possible. Since this is a conditional call, we will need to call it again unconditionally later in the init sequence. Refactor the data exchange function so it can be called multiple times without re-initializing the work item. v2: Cleaned up the code. Kept the original call to init_exchange_data() inside early init to initialize the work item, afterwards call exchange_data() when needed. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed By: Shaoyun.liu <Shaoyun.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-12-16 14:08:20 -05:00
Bokun Zhang	ed9d205363	drm/amdgpu: Complete multimedia bandwidth interface - Update SRIOV PF2VF header with latest revision - Extend existing function in amdgpu_virt.c to read MM bandwidth config from PF2VF message - Add SRIOV Sienna Cichlid codec array and update the bandwidth with PF2VF message v2: squash in removal of unused variable (Alex) Signed-off-by: Bokun Zhang <bokun.zhang@amd.com> Reviewed-by: Monk liu <monk.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-19 22:29:58 -04:00
Peng Ju Zhou	5d23851029	drm/amdgpu: indirect register access for nv12 sriov using the control bits got from host to control registers access. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-09 16:50:13 -04:00
Peng Ju Zhou	8b8a162da8	drm/amdgpu: indirect register access for nv12 sriov unify host driver and guest driver indirect access control bits names Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Emily.Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-09 16:50:06 -04:00
Rohit Khaire	4d675e1eb8	drm/amdgpu: Add new PF2VF flags for VF register access method Add 3 sub flags to notify guest for indirect reg access of gc, mmhub and ih The host sets these flags depending on L1 RAP version, asic and other scenarios. These flags ensure that there is compatibility between different guest/host/vbios versions. Signed-off-by: Rohit Khaire <rohit.khaire@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Luben Tuikov <luben.tuikov@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-09 16:49:22 -04:00
Bokun Zhang	519b8b76f0	drm/amdgpu: Implement new guest side VF2PF message transaction (v2) - Refactor the driver code to use amdgpu_virt_read_pf2vf_data and amdgpu_virt_write_vf2pf_data instead of writing all code in one function (which is the old amdgpu_virt_init_data_exchange) - Adding a new transaction method for VF2PF message between host and guest driver. Guest side will periodically update VF2PF message in the framebuffer. In the new header, we include guest ucode information, guest framebuffer usage, and engine usage - Clean up the old macros since they will cause compile error if the new transaction method is used v2: squash in build fix Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 17:03:22 -04:00
Bokun Zhang	1721bc1b2a	drm/amdgpu: Update VF2PF interface - Update guest side VF2PF interface header file Signed-off-by: Bokun Zhang <Bokun.Zhang@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-25 16:55:44 -04:00
Dennis Li	53b3f8f40e	drm/amdgpu: refine codes to avoid reentering GPU recovery if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:22:56 -04:00
Christian König	f1403342eb	drm/amdgpu: revert "fix system hang issue during GPU reset" The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Stanley.Yang	5278a159cf	drm/amdgpu: support reserve bad page for virt (v3) v1: rename some functions name, only init ras error handler data for supported asic. v2: fix potential memory leak. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:17 -04:00
Kevin Wang	a7f2810337	drm/amdgpu: add amdgpu_virt_get_vf_mode helper function the swsmu or powerplay(hwmgr) need to handle task according to different VF mode, this function to help query vf mode. vf mode: 1. SRIOV_VF_MODE_BARE_METAL: the driver work on host OS (PF) 2. SRIOV_VF_MODE_ONE_VF : the driver work on guest OS with one VF 3. SRIOV_VF_MODE_MULTI_VF : the driver work on guest OS with multi VF Signed-off-by: Kevin Wang <kevin1.wang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-18 11:23:52 -04:00
Yintian Tao	d32709dac6	drm/amdgpu: resume kiq access debugfs If there is no GPU hang, user still can access debugfs through kiq. Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Monk Liu <Monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:56 -04:00
Yintian Tao	95a2f91738	drm/amdgpu: restrict debugfs register access under SR-IOV Under bare metal, there is no more else to take care of the GPU register access through MMIO. Under Virtualization, to access GPU register is implemented through KIQ during run-time due to world-switch. Therefore, under SR-IOV user can only access debugfs to r/w GPU registers when meets all three conditions below. - amdgpu_gpu_recovery=0 - TDR happened - in_gpu_reset=0 v2: merge amdgpu_virt_can_access_debugfs() into amdgpu_virt_enable_access_debugfs() v3: drop ret variable in amdgpu_virt_enable_access_debugfs() and directly return result Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-13 12:01:04 -04:00
Monk Liu	aa53bc2edb	drm/amdgpu: introduce new request and its function 1) modify xgpu_nv_send_access_requests to support new idh request 2) introduce new function: req_gpu_init_data() which is used to notify host to prepare vbios/ip-discovery/pfvf exchange Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:43 -04:00
Monk Liu	3aa0115d23	drm/amdgpu: cleanup all virtualization detection routine we need to move virt detection much earlier because: 1) HW team confirms us that RCC_IOV_FUNC_IDENTIFIER will always be at DE5 (dw) mmio offset from vega10, this way there is no need to implement detect_hw_virt() routine in each nbio/chip file. for VI SRIOV chip (tonga & fiji), the BIF_IOV_FUNC_IDENTIFIER is at 0x1503 2) we need to acknowledged we are SRIOV VF before we do IP discovery because the IP discovery content will be updated by host everytime after it recieved a new coming "REQ_GPU_INIT_DATA" request from guest (there will be patches for this new handshake soon). Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Monk Liu	b89659b783	drm/amdgpu: amends feature bits for MM bandwidth mgr Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-01 14:44:42 -04:00
Monk Liu	2e0cc4d48b	drm/amdgpu: revise RLCG access path what changed: 1)provide new implementation interface for the rlcg access path 2)put SQ_CMD/SQ_IND_INDEX to GFX9 RLCG path to let debugfs's reg_op function can access reg that need RLCG path help now even debugfs's reg_op can used to dump wave. tested-by: Monk Liu <monk.liu@amd.com> tested-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Zhou pengju <pengju.zhou@amd.com> Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-16 16:17:55 -04:00
chen gong	d33a99c4b6	drm/amdgpu: provide a generic function interface for reading/writing register by KIQ Move amdgpu_virt_kiq_rreg/amdgpu_virt_kiq_wreg function to amdgpu_gfx.c, and rename them to amdgpu_kiq_rreg/amdgpu_kiq_wreg.Make it generic and flexible. Signed-off-by: chen gong <curry.gong@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-22 16:34:14 -05:00
Yintian Tao	c9ffa427db	drm/amd/powerplay: enable pp one vf mode for vega10 Originally, due to the restriction from PSP and SMU, VF has to send message to hypervisor driver to handle powerplay change which is complicated and redundant. Currently, SMU and PSP can support VF to directly handle powerplay change by itself. Therefore, the old code about the handshake between VF and PF to handle powerplay will be removed and VF will use new the registers below to handshake with SMU. mmMP1_SMN_C2PMSG_101: register to handle SMU message mmMP1_SMN_C2PMSG_102: register to handle SMU parameter mmMP1_SMN_C2PMSG_103: register to handle SMU response v2: remove module parameter pp_one_vf v3: fix the parens v4: forbid vf to change smu feature v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute v6: change skip condition at vega10_copy_table_to_smc Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Monk Liu	4cd4c5c064	drm/amdgpu: cleanup vega10 SRIOV code path we can simplify all those unnecessary function under SRIOV for vega10 since: 1) PSP L1 policy is by force enabled in SRIOV 2) original logic always set all flags which make itself a dummy step besides, 1) the ih_doorbell_range set should also be skipped for VEGA10 SRIOV. 2) the gfx_common registers should also be skipped for VEGA10 SRIOV. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-08-02 10:17:21 -05:00
Jack Xiao	43974dacb6	drm/amdgpu: program for resuming preempted ib For new submission ib, CE/DE metadata should be programmed to 0; for partially execution ib, CE/DE metadata should be restored. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:57:40 -05:00
Trigger Huang	78d4811267	drm/amdgpu: init vega10 SR-IOV reg access mode Set different register access mode according to the features provided by firmware Signed-off-by: Trigger Huang <Trigger.Huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-05-24 12:20:50 -05:00
Yintian Tao	bb5a2bdf36	drm/amdgpu: support dpm level modification under virtualization v3 Under vega10 virtualuzation, smu ip block will not be added. Therefore, we need add pp clk query and force dpm level function at amdgpu_virt_ops to support the feature. v2: add get_pp_clk existence check and use kzalloc to allocate buf v3: return -ENOMEM for allocation failure and correct the coding style Signed-off-by: Yintian Tao <yttao@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-04-10 13:53:27 -05:00
Emily Deng	bed1ed366d	drm/amd/amdgpu/sriov: Aligned the definition with libgv Aligned the amd_sriov_msg_pf2vf_info_header and amd_sriov_msg_pf2vf_info_header's definition with libgv. Signed-off-by: Emily Deng <Emily.Deng@amd.com> Reviewed-by: Frank.Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-19 15:27:42 -05:00
Christian König	af5fe1e96a	drm/amdgpu: cleanup GMC v9 TLB invalidation Move the kiq handling into amdgpu_virt.c and drop the fallback. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Emily Deng <Emily.Deng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 15:49:40 -05:00
Rex Zhu	7946340fa3	drm/amdgpu: Move csa related code to separate file In baremetal, also need to reserve csa for preemption. so move the csa related code out of sriov. Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:49 -05:00
Rex Zhu	1e256e2762	drm/amdgpu: Refine CSA related functions There is no functional changes, Use function arguments for SRIOV special variables which is hardcode in those functions. so we can share those functions in baremetal. Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:48 -05:00
Rex Zhu	20bedfe0c1	drm/amdgpu: Remove useless csa gpu address in vmid0 driver didn't use this address so far. Reviewed-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Rex Zhu <Rex.Zhu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-11-05 14:21:48 -05:00
Christian König	6f05c4e9d1	drm/amdgpu: move static CSA address to top of address space v2 Move the CSA area to the top of the VA space to avoid clashing with HMM/ATC in the lower range on GFX9. v2: wrong sign noticed by Roger, rebase on CSA_VADDR cleanup, handle VA hole on GFX9 as well. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Monk Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2018-02-19 14:18:48 -05:00
Monk Liu	84e5b5161e	drm/amdgpu:free CSA in unified place instead of doing it in each GFX ip's sw_fini Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-06 12:47:51 -05:00
Monk Liu	75bc6099bc	drm/amdgpu:read VRAMLOST from gim Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:45 -05:00
Monk Liu	13a752e3a2	drm/amdgpu:cleanup in_sriov_reset and lock_reset since now gpu reset is unified with gpu_recover for both bare-metal and SR-IOV: 1)rename in_sriov_reset to in_gpu_reset 2)move lock_reset from adev->virt to adev Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:31 -05:00
Monk Liu	5740682e66	drm/amdgpu:implement new GPU recover(v3) 1,new imple names amdgpu_gpu_recover which gives more hint on what it does compared with gpu_reset 2,gpu_recover unify bare-metal and SR-IOV, only the asic reset part is implemented differently 3,gpu_recover will increase hang job karma and mark its entity/context as guilty if exceeds limit V2: 4,in scheduler main routine the job from guilty context will be immedialy fake signaled after it poped from queue and its fence be set with "-ECANCELED" error 5,in scheduler recovery routine all jobs from the guilty entity would be dropped 6,in run_job() routine the real IB submission would be skipped if @skip parameter equales true or there was VRAM lost occured. V3: 7,replace deprecated gpu reset, use new gpu recover Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:41:30 -05:00
pding	b636176efd	drm/amdgpu/virt: add wait_reset virt ops Driver can use this interface to check if there's a function level reset done in hypervisor. It's helpful when IRQ handler for reset is not ready, or special handling is required. Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Monk Liu <monk.liu@amd.com> Signed-off-by: pding <Pixel.Ding@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-12-04 16:33:13 -05:00

1 2

73 Commits