linux

iv/linux

Author	SHA1	Message	Date
Philip Yang	ff891a2e64	drm/amdkfd: check access permisson to restore retry fault Check range access permission to restore GPU retry fault, if GPU retry fault on address which belongs to VMA, and VMA has no read or write permission requested by GPU, failed to restore the address. The vm fault event will pass back to user space. Signed-off-by: Philip Yang <Philip.Yang@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-08-24 15:36:50 -04:00
Tao Zhou	9dbd8a1251	drm/amdgpu: add cyan_skillfish support in gmc v10 Add gmc support for cyan_skillfish. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-07-23 10:08:01 -04:00
Xiaomeng Hou	b3accd6f66	drm/amdgpu: add gpu harvest support for yellow carp (v2) Register callback in gfxhub functions to program the bypass groups in gc_utcl2 corresponding to harvested SA. v2: update comments (Alex) Signed-off-by: Xiaomeng Hou <Xiaomeng.Hou@amd.com> Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-04 16:03:24 -04:00
Huang Rui	e15a5fb9b6	drm/amdgpu: introduce a stolen reserved buffer to protect specific buffer region (v2) Some ASICs such as Yellow Carp needs to reserve a region of video memory to avoid access from driver. So this patch is to introduce a stolen reserved buffer to protect specific buffer region. v2: free this buffer in amdgpu_ttm_fini. Signed-off-by: Huang Rui <ray.huang@amd.com> Acked-and-Tested-by: Aaron Liu <aaron.liu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-04 16:03:12 -04:00
Aaron Liu	c817cfa313	drm/amdgpu: add gmc v10 supports for yellow carp Add gfx memory controller support for yellow carp. Signed-off-by: Aaron Liu <aaron.liu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-06-04 16:03:07 -04:00
Dave Airlie	5745d647d5	Merge tag 'amd-drm-next-5.14-2021-06-02' of https://gitlab.freedesktop.org/agd5f/linux into drm-next amd-drm-next-5.14-2021-06-02: amdgpu: - GC/MM register access macro clean up for SR-IOV - Beige Goby updates - W=1 Fixes - Aldebaran fixes - Misc display fixes - ACPI ATCS/ATIF handling rework - SR-IOV fixes - RAS fixes - 16bpc fixed point format support - Initial smartshift support - RV/PCO power tuning fixes for suspend/resume - More buffer object subclassing work - Add new INFO query for additional vbios information - Add new placement for preemptable SG buffers amdkfd: - Misc fixes radeon: - W=1 Fixes - Misc cleanups UAPI: - Add new INFO query for additional vbios information Useful for debugging vbios related issues. Proposed umr patch: https://patchwork.freedesktop.org/patch/433297/ - 16bpc fixed point format support IGT test: https://lists.freedesktop.org/archives/igt-dev/2021-May/031507.html Proposed Vulkan patch: `a25d480207` - Add a new GEM flag which is only used internally in the kernel driver. Userspace is not allowed to set it. drm: - 16bpc fixed point format fourcc Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210602214009.4553-1-alexander.deucher@amd.com	2021-06-04 06:13:57 +10:00
Thomas Zimmermann	304ba5dca4	Merge drm/drm-next into drm-misc-next Backmerging from drm/drm-next to the patches for AMD devices for v5.14. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>	2021-05-22 07:17:05 +02:00
Lee Jones	2cce318c3b	drm/amd/amdgpu/gmc_v10_0: Fix potential copy/paste issue Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:955: warning: expecting prototype for gmc_v8_0_gart_fini(). Prototype was for gmc_v10_0_gart_fini() instead Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-21 10:32:17 -04:00
Peng Ju Zhou	6ba3f59eb4	drm/amdgpu: Modify GC register access from MMIO to RLCG in file amdgpu_gmc.c In SRIOV environment, KMD should access GC registers with RLCG if GC indirect access flag enabled. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-21 10:32:08 -04:00
Andrey Grodzovsky	d10d0daa20	drm/amdgpu: Handle IOMMU enabled case. Problem: Handle all DMA IOMMU group related dependencies before the group is removed. Those manifest themself in that when IOMMU enabled DMA map/unmap is dependent on the presence of IOMMU group the device belongs to but, this group is released once the device is removed from PCI topology. Fix: Expedite all such unmap operations to pci remove driver callback. v5: Drop IOMMU notifier and switch to lockless call to ttm_tt_unpopulate v6: Drop the BO unamp list v7: Drop amdgpu_gart_fini In amdgpu_ih_ring_fini do uncinditional check (!ih->ring) to avoid freeing uniniitalized rings. Call amdgpu_ih_ring_fini unconditionally. v8: Add deatiled explanation Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210517143851.475058-1-andrey.grodzovsky@amd.com	2021-05-19 23:50:27 -04:00
Chengming Gui	2d527ea6fd	drm/amd/amdgpu: add gmc ip block for beige_goby Enable gmc block for beige_goby, same as sienna_cichlid Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-19 22:40:17 -04:00
Chengming Gui	d2bfc50de2	drm/amd/amdgpu: add gmc support for beige_goby Same as dimgrey_cavefish Signed-off-by: Chengming Gui <Jack.Gui@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-19 22:40:03 -04:00
Peng Ju Zhou	d9c7f753b8	drm/amdgpu: Refine the error report when flush tlb. there are 2 hubs to flush in the gmc, to make it easier to debug when hub flush failed, refine the logs. Signed-off-by: Peng Ju Zhou <PengJu.Zhou@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-05-19 22:38:03 -04:00
Oak Zeng	0ca565ab97	drm/amdgpu: Calling address translation functions to simplify codes Use amdgpu_gmc_vram_pa and amdgpu_gmc_vram_cpu_pa to simplify codes. No logic change. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-15 16:03:01 -04:00
Hawking Zhang	49070c4ea3	drm/amdgpu: split umc callbacks to ras and non-ras ones umc ras is not managed by gpu driver when gpu is connected to cpu through xgmi. split umc callbacks into ras and non-ras ones so gpu driver only initializes umc ras callbacks when it manages umc ras. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Dennis Li <Dennis.Li@amd.com> Reviewed-by: John Clements <John.Clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-04-09 16:51:11 -04:00
Sebastian Andrzej Siewior	58df0d7143	drm/amdgpu: Replace in_interrupt() usage in gmc_v_process_interrupt() The usage of in_interrupt() in gmc_v_process_interrupt() is intended to use a different code path if invoked from the interrupt handler vs invoked from the workqueue. The usage of in_interrupt() in drivers is phased out and Linus clearly requested that code which changes behaviour depending on context should either be separated or the context be conveyed in an argument passed by the caller, which usually knows the context. gmc_v_process_interrupt() is invoked via the ->process() callback from amdgpu_ih_process() which in turn is invoked either from amdgpu_irq_handler() (the interrupt handler) or from amdgpu_irq_handle_() which is a workqueue. amdgpu_irq::ih is always processed from the interrupt handler, the other three struct amdgpu_ih_ring members are processed from a workqueue. Replace the in_interrupt() check with a comparison against adev->irq.ih. A similar check is already done to check if the ih pointer is from ih_soft. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-03-23 23:27:58 -04:00
Yong Zhao	be14729a33	drm/amdgpu: Print the IH client ID name when vm fault happens This gives more information and improves productivity. Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-03-23 22:51:40 -04:00
Likun Gao	bf087285dc	drm/amdgpu: switch hdp callback functions for hdp v5 Switch to use the HDP functions which unified on hdp structure instead of the scattered hdp callback functions. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2021-01-05 11:33:08 -05:00
Alex Deucher	64f2c15892	drm/amdgpu: remove amdgpu_ttm_late_init and amdgpu_bo_late_init No longer used. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-12-08 23:06:33 -05:00
Nirmoy Das	68fce5f07c	drm/amdgpu: use AMDGPU_NUM_VMID when possible Replace hardcoded vmid number with AMDGPU_NUM_VMID macro. Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-12-08 23:05:40 -05:00
Lee Jones	185ef9ef2f	drm/amd/amdgpu/gmc_v10_0: Suppy some missing function doc descriptions Fixes the following W=1 kernel build warning(s): drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:278: warning: Function parameter or member 'vmhub' not described in 'gmc_v10_0_flush_gpu_tlb' drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:278: warning: Function parameter or member 'flush_type' not described in 'gmc_v10_0_flush_gpu_tlb' drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:371: warning: Function parameter or member 'flush_type' not described in 'gmc_v10_0_flush_gpu_tlb_pasid' drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c:371: warning: Function parameter or member 'all_hub' not described in 'gmc_v10_0_flush_gpu_tlb_pasid' Cc: Alex Deucher <alexander.deucher@amd.com> Cc: "Christian König" <christian.koenig@amd.com> Cc: David Airlie <airlied@linux.ie> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: amd-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Cc: linux-media@vger.kernel.org Cc: linaro-mm-sig@lists.linaro.org Signed-off-by: Lee Jones <lee.jones@linaro.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-12-01 16:04:41 -05:00
Alex Deucher	99698b51e5	drm/amdgpu: enable AGP aperture on gmc10.x (v2) Just a small optimization for accessing system pages directly. Was missed for gmc v10 since the feature landed for older gmcs while we were still on the emulator or gmc10 and we use the AGP aperture for zfb on the emulator. v2: fix up the system aperture as well Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-12-01 16:04:28 -05:00
Christian König	a2a8857cee	drm/amdgpu: implement retry fault handling for Navi Same as gmc9, basically filter the fault, reroute or handle it. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-11-24 12:07:01 -05:00
Christian König	45d87b85d3	drm/amdgpu: cleanup gmc_v10_0_process_interrupt a bit Return early in case of a ratelimit and don't print leading zeros for the address. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-11-24 12:06:32 -05:00
Deepak R Varma	94ba290da1	drm/amdgpu: improve code indentation and alignment General code indentation and alignment changes such as replace spaces by tabs or align function arguments as per the coding style guidelines. The patch covers various .c files for this driver. Issue reported by checkpatch script. Signed-off-by: Deepak R Varma <mh12gx2825@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-11-02 15:34:31 -05:00
Likun Gao	4005809bb1	drm/amdgpu: add support to configure MALL for sienna_cichlid (v2) Enable Memory Access at Last Level (MALL) feature for sienna_cichlid. v2: drop module option. We need to add UAPI so userspace can request MALL per buffer. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-23 15:33:48 -04:00
Alex Deucher	5c46c49276	drm/amdgpu/gmc10: remove dummy read workaround for newer chips Sienna Cichlid and newer have a hw fix so no longer require the workaround. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-21 16:14:28 -04:00
Tao Zhou	f267242e15	drm/amdgpu: add gmc cg support for dimgrey_cavefish The athub version for dimgrey_cavefish is v2.1. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-12 14:00:57 -04:00
Tao Zhou	3e02ad4476	drm/amdgpu: add gmc ip block for dimgrey_cavefish Enable gmc block for dimgrey_cavefish, same as sienna_cichlid. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-12 14:00:29 -04:00
Tao Zhou	a14354690f	drm/amdgpu: add gmc support for dimgrey_cavefish Same as navy_flounder. Signed-off-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Jiansong Chen <Jiansong.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-12 14:00:13 -04:00
Huang Rui	4d8d75a45c	drm/amdgpu: add mmhub v2.3 for vangogh (v4) There are too many register offset mismatch between mmhub v2.0 and v2.3. E.X: mmMM_ATC_L2_MISC_CG: 0x064a(v2.0) 0x06cd(v2.3) mmMMVM_L2_PROTECTION_FAULT_CNTL: 0x0688(v2.0) 0x0708(v2.3) mmMMVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_LO32: 0x072b(v2.0) 0x0940(v2.3) mmMMVM_CONTEXT0_PAGE_TABLE_BASE_ADDR_HI32: 0x072c(v2.0) 0x0941(v2.3) mmMMVM_INVALIDATE_ENG0_REQ: 0x06e3(v2.0) 0x0a01(v2.3) mmMMVM_INVALIDATE_ENG0_ACK: 0x06f5(v2.0) 0x0a02(v2.3) mmMMVM_CONTEXT0_CNTL: 0x06c0(v2.0) 0x0740(v2.3) mmMMVM_L2_PROTECTION_FAULT_STATUS: 0x068c(v2.0) 0x070c(v2.3) mmMMVM_L2_PROTECTION_FAULT_CNTL: 0x0688(v2.0) 0x0708(v2.3) mmMM_ATC_L2_MISC_CG: 0x064a(v2.0) 0x06cd(v2.3) mmDAGB0_CNTL_MISC2: 0x0071(v2.0) 0x0096(v2.3) ... Continuing using the same file mmhub v2.0 is not good choice, it will introduce a lot of checking with ASIC types. And also easy to introduce the issues that offset not align, this kind of issues are really hard to find. Van Gogh's mmhub vm invalidation is actually caused by the offset mismatch as well. So it would like to create a new file rather than stick to re-use orignal mmhub v2.0 here. v2: add missed translate_further programming. v3: sync with latest code v4: add missing callbacks Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-05 15:15:27 -04:00
Huang Rui	6405e627a0	drm/amdgpu: add gmc v10 supports for van gogh (v4) Add gfx memory controller support for van gogh. v2: don't use dynamic invalidate eng allocation for van gogh. v3: squash in other fixes v4: rebase Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-10-05 15:14:03 -04:00
Oak Zeng	8ffff9b449	drm/amdgpu: use function pointer for gfxhub functions gfxhub functions are now called from function pointers, instead of from asic-specific functions. Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-30 13:50:13 -04:00
Dennis Li	81202807ae	drm/amdgpu: block ring buffer access during GPU recovery When GPU is in reset, its status isn't stable and ring buffer also need be reset when resuming. Therefore driver should protect GPU recovery thread from ring buffer accessed by other threads. Otherwise GPU will randomly hang during recovery. v2: correct indent Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-09-03 14:46:55 -04:00
Dennis Li	aac891685d	drm/amdgpu: refine message print for devices of hive Using dev_xxx instead of DRM_xxx/pr_xxx to indicate which device of a hive is the message for. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:24:06 -04:00
Dennis Li	53b3f8f40e	drm/amdgpu: refine codes to avoid reentering GPU recovery if other threads have holden the reset lock, recovery will fail to try_lock. Therefore we introduce atomic hive->in_reset and adev->in_gpu_reset, to avoid reentering GPU recovery. v2: drop "? true : false" in the definition of amdgpu_in_reset Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 12:22:56 -04:00
Christian König	f1403342eb	drm/amdgpu: revert "fix system hang issue during GPU reset" The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Oak Zeng	9fb1506eb6	drm/amdgpu: Use function pointer for some mmhub functions Add more function pointers to amdgpu_mmhub_funcs. ASIC specific implementation of most mmhub functions are called from a general function pointer, instead of calling different function for different ASIC. Simplify the code by deleting duplicate functions Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Alex Deucher	7348c20a4e	drm/amdgpu/gmc10: switch to using amdgpu_gmc_get_vbios_allocations The new helper centralizes the logic in one place. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:29 -04:00
Alex Deucher	fcbc92e2e1	drm/amdgpu: move stolen vga bo from amdgpu to amdgpu.gmc Since that is where we store the other data related to the stolen vga memory. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
Alex Deucher	81b54fb7a2	drm/amdgpu: use a define for the memory size of the vga emulator Rather than open coding it everywhere. Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:28 -04:00
John Clements	0ad7a64d69	drm/amdgpu: enable RAS support for sienna cichlid enabled GECC error injection and query support Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-04 17:29:08 -04:00
John Clements	01eee24fce	drm/amdgpu: enable umc 8.7 functions in gmc v10 add support for umc 8.7 initialization add umc 8.7 source to makefile Reviewed-by: Guchun Chen <guchun.chen@amd.com> Signed-off-by: John Clements <john.clements@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:33 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Huang Rui	db92fbc3d7	drm/amdgpu: won't include gc and mmhub register headers in GMC block All gc/mmhub register access and operation should be in gfxhub/mmhub level. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:43:10 -04:00
Huang Rui	caa9f483ca	drm/amdgpu: move get_invalidate_req function into gfxhub/mmhub level This patch is to move get_invalidate_req into gfxhub/mmhub level. It will avoid mismatch of the different gfxhub/mmhub register offsets and fields in the same gmc block. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:43:04 -04:00
Huang Rui	2577db91e8	drm/amdgpu: add vmhub funcs helper (v2) This patch is to introduce vmhub funcs helper to add following callback (print_l2_protection_fault_status). Each GC/MMHUB register specific programming should be in gfxhub/mmhub level. v2: remove the condition of funcs assignment. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:56 -04:00
Huang Rui	f2c1b5c145	drm/amdgpu: abstract set_vm_fault_masks function to refine the programming This patch is to add set_vm_fault_masks helper to amdgpu_gmc to refine the original programming. Signed-off-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-22 18:42:49 -04:00
Jiansong Chen	922783755b	drm/amdgpu: add gmc cg support for navy_flounder The athub version used for navy_flounder is v2.1. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:46:38 -04:00
Jiansong Chen	fc8f07da1f	drm/amdgpu: add gmc ip block for navy_flounder navy_flounder has similar gc IP version with sienna_cichlid, follow its setting for the moment. Signed-off-by: Jiansong Chen <Jiansong.Chen@amd.com> Reviewed-by: Tao Zhou <Tao.Zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-15 12:46:14 -04:00

1 2

98 Commits