linux

iv/linux

Author	SHA1	Message	Date
Alex Deucher	cf851f3ff8	drm/amdgpu: Fix buffer overflow in INFO ioctl The values for "se_num" and "sh_num" come from the user in the ioctl. They can be in the 0-255 range but if they're more than AMDGPU_GFX_MAX_SE (4) or AMDGPU_GFX_MAX_SH_PER_SE (2) then it results in an out of bounds read. Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:19 -04:00
Luben Tuikov	8aba21b751	drm/amdgpu: Embed drm_device into amdgpu_device (v3) a) Embed struct drm_device into struct amdgpu_device. b) Modify the inline-f drm_to_adev() accordingly. c) Modify the inline-f adev_to_drm() accordingly. d) Eliminate the use of drm_device.dev_private, in amdgpu. e) Switch from using drm_dev_alloc() to drm_dev_init(). f) Add a DRM driver release function, which frees the container amdgpu_device after all krefs on the contained drm_device have been released. v2: Split out adding adev_to_drm() into its own patch (previous commit), making this patch more succinct and clear. More detailed commit description. v3: squash in fix to call drmm_add_final_kfree() to avoid a warning. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-26 16:40:17 -04:00
Luben Tuikov	1348969ab6	drm/amdgpu: drm_device to amdgpu_device by inline-f (v2) Get the amdgpu_device from the DRM device by use of an inline function, drm_to_adev(). The inline function resolves a pointer to struct drm_device to a pointer to struct amdgpu_device. v2: Use a typed visible static inline function instead of an invisible macro. Signed-off-by: Luben Tuikov <luben.tuikov@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-24 13:06:06 -04:00
Christian König	f1403342eb	drm/amdgpu: revert "fix system hang issue during GPU reset" The whole approach wasn't thought through till the end. We already had a reset lock like this in the past and it caused the same problems like this one. Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary. This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-14 16:22:40 -04:00
Pierre-Eric Pelloux-Prayer	16c642ec3f	drm/amdgpu: new ids flag for tmz (v2) Allows UMD to know if TMZ is supported and enabled. This commit also bumps KMS_DRIVER_MINOR because if we don't UMD can't tell if "ids_flags & AMDGPU_IDS_FLAGS_TMZ == 0" means "tmz is not enabled" or "tmz may be enabled but the kernel doesn't report it". v2: use amdgpu_is_tmz() and reworded commit message. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-08-06 15:45:52 -04:00
Peilin Ye	8e32628592	drm/amdgpu: Prevent kernel-infoleak in amdgpu_info_ioctl() Compiler leaves a 4-byte hole near the end of `dev_info`, causing amdgpu_info_ioctl() to copy uninitialized kernel stack memory to userspace when `size` is greater than 356. In 2015 we tried to fix this issue by doing `= {};` on `dev_info`, which unfortunately does not initialize that 4-byte hole. Fix it by using memset() instead. Cc: stable@vger.kernel.org Fixes: c193fa91b918 ("drm/amdgpu: information leak in amdgpu_info_ioctl()") Fixes: d38ceaf99ed0 ("drm/amdgpu: add core driver (v4)") Suggested-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-30 15:36:44 -04:00
Dennis Li	df9c8d1aa2	drm/amdgpu: fix system hang issue during GPU reset when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover, the atomic adev->in_gpu_reset and hive->in_reset are used to avoid re-entering GPU recovery. During GPU reset and resume, it is unsafe that other threads access GPU, which maybe cause GPU reset failed. Therefore the new rw_semaphore adev->reset_sem is introduced, which protect GPU from being accessed by external threads during recovery. v2: 1. add rwlock for some ioctls, debugfs and file-close function. 2. change to use dqm->is_resetting and dqm_lock for protection in kfd driver. 3. remove try_lock and change adev->in_gpu_reset as atomic, to avoid re-enter GPU recovery for the same GPU hang. v3: 1. change back to use adev->reset_sem to protect kfd callback functions, because dqm_lock couldn't protect all codes, for example: free_mqd must be called outside of dqm_lock; [ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019 [ 1230.177221] Call Trace: [ 1230.178249] dump_stack+0x98/0xd5 [ 1230.179443] amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu] [ 1230.180673] gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu] [ 1230.181882] amdgpu_gart_unbind+0xa9/0xe0 [amdgpu] [ 1230.183098] amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu] [ 1230.184239] ? ttm_bo_put+0x171/0x5f0 [ttm] [ 1230.185394] ttm_tt_unbind+0x21/0x40 [ttm] [ 1230.186558] ttm_tt_destroy.part.12+0x12/0x60 [ttm] [ 1230.187707] ttm_tt_destroy+0x13/0x20 [ttm] [ 1230.188832] ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm] [ 1230.189979] ttm_bo_put+0x1be/0x5f0 [ttm] [ 1230.191230] amdgpu_bo_unref+0x1e/0x30 [amdgpu] [ 1230.192522] amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu] [ 1230.193833] free_mqd+0x25/0x40 [amdgpu] [ 1230.195143] destroy_queue_cpsch+0x1a7/0x270 [amdgpu] [ 1230.196475] pqm_destroy_queue+0x105/0x260 [amdgpu] [ 1230.197819] kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu] [ 1230.199154] kfd_ioctl+0x277/0x500 [amdgpu] [ 1230.200458] ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu] [ 1230.201656] ? tomoyo_file_ioctl+0x19/0x20 [ 1230.202831] ksys_ioctl+0x98/0xb0 [ 1230.204004] __x64_sys_ioctl+0x1a/0x20 [ 1230.205174] do_syscall_64+0x5f/0x250 [ 1230.206339] entry_SYSCALL_64_after_hwframe+0x49/0xbe 2. remove try_lock and introduce atomic hive->in_reset, to avoid re-enter GPU recovery. v4: 1. remove an unnecessary whitespace change in kfd_chardev.c 2. remove comment codes in amdgpu_device.c 3. add more detailed comment in commit message 4. define a wrap function amdgpu_in_reset v5: 1. Fix some style issues. Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Suggested-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <Felix.Kuehling@amd.com> Suggested-by: Lijo Lazar <Lijo.Lazar@amd.com> Suggested-by: Luben Tukov <luben.tuikov@amd.com> Signed-off-by: Dennis Li <Dennis.Li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-27 16:21:37 -04:00
Dave Airlie	9555152beb	Merge tag 'amd-drm-next-5.9-2020-07-01' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.9-2020-07-01: amdgpu: - DC DMUB updates - HDCP fixes - Thermal interrupt fixes - Add initial support for Sienna Cichlid GPU - Add support for unique id on Arcturus - Major swSMU code cleanup - Skip BAR resizing if the bios already did id - Fixes for DCN bandwidth calculations - Runtime PM reference count fixes - Add initial UVD support for SI - Add support for ASSR on eDP links - Lots of misc fixes and cleanups - Enable runtime PM on vega10 boards that support BACO - RAS fixes - SR-IOV fixes - Use IP discovery table on renoir - DC stream synchronization fixes amdkfd: - Track SDMA usage per process - Fix GCC10 compiler warnings - Locking fix radeon: - Default to on chip GART for AGP boards on all arches - Runtime PM reference count fixes UAPI: - Update comments to clarify MTYPE From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200701155041.1102829-1-alexander.deucher@amd.com Signed-off-by: Dave Airlie <airlied@redhat.com>	2020-07-02 15:17:31 +10:00
Alex Deucher	cd5277809b	drm/amdgpu: enable runtime pm on vega10 when noretry=0 The failures with ROCm only happen with noretry=1, so enable runtime pm when noretry=0 (the current default). Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:27 -04:00
Alex Deucher	b38c69688f	drm/amdgpu: rework runtime pm enablement for BACO Add a switch statement to simplify asic checks. Note that BACO is not supported on APUs, so there is no need to check them. Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:27 -04:00
Navid Emamdoost	9ba8923cbb	drm/amdgpu: fix ref count leak in amdgpu_driver_open_kms in amdgpu_driver_open_kms the call to pm_runtime_get_sync increments the counter even in case of failure, leading to incorrect ref count. In case of failure, decrement the ref count before returning. Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:22 -04:00
Likun Gao	9b76e06113	drm/amdgpu: disable runtime pm for sienna_cichlid temporarily Disable runtime pm for sienna_cichlid temporarily as BACO regression issue. Signed-off-by: Likun Gao <Likun.Gao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-07-01 01:59:13 -04:00
Linus Torvalds	faa392181a	drm pull for 5.8-rc1 core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJe1edsAAoJEAx081l5xIa+bKEQAJAZv/8OMM2rx+p+GyKgrNpl ihTX/oyToy8dw97s1kWF7V5kKU+qjF8aWlKoPS0xovzaMAzYSFz9FRNEUgqtTXMI zIAzSXioqP21oL9/ZTHcXDULtz8Gk3uiPomgXMWLlNBdt3X5qvCwsmPRIYSwG0GJ 00VCvxDbVxGSM3wzcvbfyRwHCq3SrFvIusXv5jHnnxEFGH0C7Mj2/FLYMKLNjvli Q8VEI2wQPZj1QdA8fLFVneIQsR6YUSko9OfFMANP8VJGpPMnUkvVxTJ5ACGJspvn U/h6NYqJeUU2Y3BSKqtjIC3a1LY51tp5tL9q4H9TD1hqMckt6F2V7T2IeFU8i6+V YzUsSiT4q1xB+uiFVcgopx2hyIp8INOEyWrVdYgw2JviROeRD+pDHvJd13ZNMnTe GvLWQ/PfBFrcz8eligjiYjOf66ZTU+j/rivaOBFyrs9gdlsaEW2QRurFrcNX+0lZ kDbLsIFjhYnPXsvHP87x4BuQCKQIEh8wWuxXuJjunBPdqVrJyltZWbBiKO571b5/ BtX6xj6ztUOffR2RdiVanzY546I2hEi7SHMUuWnMqXsOV46GBN0QvlpZad/47n9x ZUy8HDDD0/qWuGwvPOJGIeAnUteWge9AhWXTeN5+1h5m+QEOzYkPKqC3Hp8TW1pM gToTWgAhnu731fhzLWyt =H7IS -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm Pull drm updates from Dave Airlie: "Highlights: - Core DRM had a lot of refactoring around managed drm resources to make drivers simpler. - Intel Tigerlake support is on by default - amdgpu now support p2p PCI buffer sharing and encrypted GPU memory Details: core: - uapi: error out EBUSY when existing master - uapi: rework SET/DROP MASTER permission handling - remove drm_pci.h - drm_pci* are now legacy - introduced managed DRM resources - subclassing support for drm_framebuffer - simple encoder helper - edid improvements - vblank + writeback documentation improved - drm/mm - optimise tree searches - port drivers to use devm_drm_dev_alloc dma-buf: - add flag for p2p buffer support mst: - ACT timeout improvements - remove drm_dp_mst_has_audio - don't use 2nd TX slot - spec recommends against it bridge: - dw-hdmi various improvements - chrontel ch7033 support - fix stack issues with old gcc hdmi: - add unpack function for drm infoframe fbdev: - misc fbdev driver fixes i915: - uapi: global sseu pinning - uapi: OA buffer polling - uapi: remove generated perf code - uapi: per-engine default property values in sysfs - Tigerlake GEN12 enabled. - Lots of gem refactoring - Tigerlake enablement patches - move to drm_device logging - Icelake gamma HW readout - push MST link retrain to hotplug work - bandwidth atomic helpers - ICL fixes - RPS/GT refactoring - Cherryview full-ppgtt support - i915 locking guidelines documented - require linear fb stride to be 512 multiple on gen9 - Tigerlake SAGV support amdgpu: - uapi: encrypted GPU memory handling - uapi: add MEM_SYNC IB flag - p2p dma-buf support - export VRAM dma-bufs - FRU chip access support - RAS/SR-IOV updates - Powerplay locking fixes - VCN DPG (powergating) enablement - GFX10 clockgating fixes - DC fixes - GPU reset fixes - navi SDMA fix - expose FP16 for modesetting - DP 1.4 compliance fixes - gfx10 soft recovery - Improved Critical Thermal Faults handling - resizable BAR on gmc10 amdkfd: - uapi: GWS resource management - track GPU memory per process - report PCI domain in topology radeon: - safe reg list generator fixes nouveau: - HD audio fixes on recent systems - vGPU detection (fail probe if we're on one, for now) - Interlaced mode fixes (mostly avoidance on Turing, which doesn't support it) - SVM improvements/fixes - NVIDIA format modifier support - Misc other fixes. adv7511: - HDMI SPDIF support ast: - allocate crtc state size - fix double assignment - fix suspend bochs: - drop connector register cirrus: - move to tiny drivers. exynos: - fix imported dma-buf mapping - enable runtime PM - fixes and cleanups mediatek: - DPI pin mode swap - config mipi_tx current/impedance lima: - devfreq + cooling device support - task handling improvements - runtime PM support pl111: - vexpress init improvements - fix module auto-load rcar-du: - DT bindings conversion to YAML - Planes zpos sanity check and fix - MAINTAINERS entry for LVDS panel driver mcde: - fix return value mgag200: - use managed config init stm: - read endpoints from DT vboxvideo: - use PCI managed functions - drop WC mtrr vkms: - enable cursor by default rockchip: - afbc support virtio: - various cleanups qxl: - fix cursor notify port hisilicon: - 128-byte stride alignment fix sun4i: - improved format handling" * tag 'drm-next-2020-06-02' of git://anongit.freedesktop.org/drm/drm: (1401 commits) drm/amd/display: Fix potential integer wraparound resulting in a hang drm/amd/display: drop cursor position check in atomic test drm/amdgpu: fix device attribute node create failed with multi gpu drm/nouveau: use correct conflicting framebuffer API drm/vblank: Fix -Wformat compile warnings on some arches drm/amdgpu: Sync with VM root BO when switching VM to CPU update mode drm/amd/display: Handle GPU reset for DC block drm/amdgpu: add apu flags (v2) drm/amd/powerpay: Disable gfxoff when setting manual mode on picasso and raven drm/amdgpu: fix pm sysfs node handling (v2) drm/amdgpu: move gpu_info parsing after common early init drm/amdgpu: move discovery gfx config fetching drm/nouveau/dispnv50: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau: fix runtime pm imbalance on error drm/nouveau/debugfs: fix runtime pm imbalance on error drm/nouveau/nouveau/hmm: fix migrate zero page to GPU drm/nouveau/nouveau/hmm: fix nouveau_dmem_chunk allocations drm/nouveau/kms/nv50-: Share DP SST mode_valid() handling with MST drm/nouveau/kms/nv50-: Move 8BPC limit for MST into nv50_mstc_get_modes() ...	2020-06-02 15:04:15 -07:00
Alex Deucher	d7aca4f0b2	drm/amdgpu: simplify mec2 fw check Check if mec2 fw exists rather than checking asic types. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-28 14:00:49 -04:00
Alex Deucher	b586154466	drm/amdgpu: only set DPM_FLAG_NEVER_SKIP for legacy ATPX BOCO We only need to set DPM_FLAG_NEVER_SKIP for the legacy ATPX BOCO case. D3cold and BACO work as expected. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-08 14:33:32 -04:00
Alex Deucher	f0d6967808	drm/amdgpu: drop pm_runtime_set_active The pci core handles this for us in pci_pm_init. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-05-08 14:33:21 -04:00
Rafael J. Wysocki	e07515563d	PM: sleep: core: Rename DPM_FLAG_NEVER_SKIP Rename DPM_FLAG_NEVER_SKIP to DPM_FLAG_NO_DIRECT_COMPLETE which matches its purpose more closely. No functional impact. Suggested-by: Alan Stern <stern@rowland.harvard.edu> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # for PCI parts Acked-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com> Acked-by: Alan Stern <stern@rowland.harvard.edu> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-24 21:33:09 +02:00
Aurabindo Pillai	ad36d71b3f	amdgpu_kms: Remove unnecessary condition check Execution will only reach here if the asserted condition is true. Hence there is no need for the additional check. Signed-off-by: Aurabindo Pillai <mail@aurabindo.in> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-04-09 10:43:18 -04:00
Feifei Xu	5d11e37c02	drm/amdgpu/runpm: disable runpm on Vega10 Some framework test will fail if enable runpm on Vega10. Disable it untill issue fixed. Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Tested-by: Kyle Chen <Kyle.Chen@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-10 15:55:18 -04:00
Monk Liu	752c683dbb	drm/amdgpu: fix IB test MCBP bug 1)for gfx IB test we shouldn't insert DE meta data 2)we should make sure IB test finished before we send event 3 to hypervisor otherwise the IDLE from event 3 will preempt IB test, which is not designed as a compatible structure for MCBP Signed-off-by: Monk Liu <Monk.Liu@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-03-05 00:28:11 -05:00
Dave Airlie	a2ae604da7	Merge tag 'amd-drm-next-5.7-2020-02-26' of git://people.freedesktop.org/~agd5f/linux into drm-next amd-drm-next-5.7-2020-02-26: amdgpu: - Rework VM update handling in preparation for HMM support - HDCP srm support - PSR fixes - DC watermark fixes - OLED panel support - SR-IOV fixes - BACO fixes - Optimize debugging vram access - RAS fixes - Use BACO for runtime pm - HDCP fixes - XGMI fixes - DDC fixes - DC clock programming optimizations and fixes - PSP fw loading sequence updates - Drop DRIVER_USE_AGP - Remove legacy drm load and unload callbacks amdkfd: - Add runtime pm support radeon: - Drop DRIVER_USE_AGP Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200227043142.4075-1-alexander.deucher@amd.com	2020-02-28 15:40:26 +10:00
Thomas Zimmermann	e3eff4b5d9	drm/amdgpu: Convert to CRTC VBLANK callbacks VBLANK callbacks in struct drm_driver are deprecated in favor of their equivalents in struct drm_crtc_funcs. Convert amdgpu over. v2: * don't wrap existing functions; change signature instead Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Alex Deucher	4fdda2e66d	drm/amdgpu/runpm: enable runpm on baco capable VI+ asics Seems to work reliably on VI+ except for a few so enable runpm barring those where baco for runtime power management is not supported. [rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to enable runtime pm with baco for kfd. Also fixed a checkpatch warning and added extra checks for VEGA20 and ARCTURUS. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:01:03 -05:00
Yintian Tao	c9ffa427db	drm/amd/powerplay: enable pp one vf mode for vega10 Originally, due to the restriction from PSP and SMU, VF has to send message to hypervisor driver to handle powerplay change which is complicated and redundant. Currently, SMU and PSP can support VF to directly handle powerplay change by itself. Therefore, the old code about the handshake between VF and PF to handle powerplay will be removed and VF will use new the registers below to handshake with SMU. mmMP1_SMN_C2PMSG_101: register to handle SMU message mmMP1_SMN_C2PMSG_102: register to handle SMU parameter mmMP1_SMN_C2PMSG_103: register to handle SMU response v2: remove module parameter pp_one_vf v3: fix the parens v4: forbid vf to change smu feature v5: use hwmon_attributes_visible to skip sepicified hwmon atrribute v6: change skip condition at vega10_copy_table_to_smc Signed-off-by: Yintian Tao <yttao@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Kenneth Feng <kenneth.feng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-12-11 15:22:07 -05:00
Alex Deucher	72f058b723	drm/amdgpu: enable runtime pm on BACO capable boards if runpm=1 BACO - Bus Active, Chip Off Everything is in place now. Not enabled by default yet. You still have to specify runpm=1. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:54 -05:00
Alex Deucher	6ae6c7d404	drm/amdgpu: start to disentangle boco from runtime pm BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off We originally only supported runtime pm on PX/HG laptops so most of the runtime pm code looks for this. Add a new flag to check for runtime pm enablement and use this rather than checking for PX/HG. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:53 -05:00
Alex Deucher	31af062acf	drm/amdgpu: rename amdgpu_device_is_px to amdgpu_device_supports_boco (v2) BACO - Bus Active, Chip Off BOCO - Bus Off, Chip Off To better match what we are checking for and to align with amdgpu_device_supports_baco. BOCO is used on PowerXpress/Hybrid Graphics systems and BACO is used on desktop dGPU boards. v2: fix typo in documentation Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 16:42:50 -05:00
Alex Deucher	ca9317b918	drm/amdgpu: disable gfxoff when using register read interface When gfxoff is enabled, accessing gfx registers via MMIO can lead to a hang. Bug: https://bugzilla.kernel.org/show_bug.cgi?id=205497 Acked-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:54 -05:00
Leo Liu	52f2e779ad	drm/amdgpu: add driver support for JPEG2.0 and above By using JPEG IP block type Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:50 -05:00
Leo Liu	0388aee766	drm/amdgpu: use the JPEG structure for general driver support JPEG1.0 will be functional along with VCN1.0 Signed-off-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-19 10:12:49 -05:00
Nicholas Kazlauskas	976e51a7c0	drm/amdgpu: Add DMCUB to firmware query interface The DMCUB firmware version can be read using the AMDGPU_INFO ioctl or the amdgpu_firmware_info debugfs entry. Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Reviewed-by: Hersen Wu <hersenxs.wu@amd.com> Acked-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-13 15:29:42 -05:00
Evan Quan	b0adca4d50	drm/amdgpu: register gpu instance before fan boost feature enablment Otherwise, the feature enablement will be skipped due to wrong count. Fixes: beff74bc6e0fa91 ("drm/amdgpu: fix a race in GPU reset with IB test (v2)") Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-11-06 16:27:47 -05:00
Dave Airlie	3275a71e76	Merge tag 'drm-next-5.5-2019-10-09' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.5-2019-10-09: amdgpu: - Additional RAS enablement for vega20 - RAS page retirement and bad page storage in EEPROM - No GPU reset with unrecoverable RAS errors - Reserve vram for page tables rather than trying to evict - Fix issues with GPU reset and xgmi hives - DC i2c over aux fixes - Direct submission for clears, PTE/PDE updates - Improvements to help support recoverable GPU page faults - Silence harmless SAD block messages - Clean up code for creating a bo at a fixed location - Initial DC HDCP support - Lots of documentation fixes - GPU reset for renoir - Add IH clockgating support for soc15 asics - Powerplay improvements - DC MST cleanups - Add support for MSI-X - Misc cleanups and bug fixes amdkfd: - Query KFD device info by asic type rather than pci ids - Add navi14 support - Add renoir support - Add navi12 support - gfx10 trap handler improvements - pasid cleanups - Check against device cgroup ttm: - Return -EBUSY with pipelining with no_gpu_wait radeon: - Silence harmless SAD block messages device_cgroup: - Export devcgroup_check_permission Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20191010041713.3412-1-alexander.deucher@amd.com	2019-10-26 05:56:57 +10:00
Hans de Goede	984d7a929a	drm/amdgpu: Bail earlier when amdgpu.cik_/si_support is not set to 1 Bail from the pci_driver probe function instead of from the drm_driver load function. This avoid /dev/dri/card0 temporarily getting registered and then unregistered again, sending unwanted add / remove udev events to userspace. Specifically this avoids triggering the (userspace) bug fixed by this plymouth merge-request: https://gitlab.freedesktop.org/plymouth/plymouth/merge_requests/59 Note that despite that being a userspace bug, not sending unnecessary udev events is a good idea in general. BugLink: https://bugzilla.redhat.com/show_bug.cgi?id=1490490 Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2019-10-11 21:31:14 -05:00
Marek Olšák	cf21e76a60	drm/amdgpu: return tcc_disabled_mask to userspace UMDs need this for correct programming of harvested chips. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:05 -05:00
Marek Olšák	6de088a08d	drm/amdgpu: remove gfx9 NGG Never used. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-03 09:11:01 -05:00
Marek Olšák	815fb4c9d7	drm/amdgpu: return tcc_disabled_mask to userspace UMDs need this for correct programming of harvested chips. Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-10-02 12:58:33 -05:00
Trek	73d8e6c7b8	drm/amdgpu: Check for valid number of registers to read Do not try to allocate any amount of memory requested by the user. Instead limit it to 128 registers. Actually the longest series of consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and mmGB_MACROTILE_MODE0-15 (0x2644-0x2673). Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273 Signed-off-by: Trek <trek00@inbox.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-17 14:37:45 -05:00
Trek	13238d4fa6	drm/amdgpu: Check for valid number of registers to read Do not try to allocate any amount of memory requested by the user. Instead limit it to 128 registers. Actually the longest series of consecutive allowed registers are 48, mmGB_TILE_MODE0-31 and mmGB_MACROTILE_MODE0-15 (0x2644-0x2673). Bug: https://bugs.freedesktop.org/show_bug.cgi?id=111273 Signed-off-by: Trek <trek00@inbox.ru> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-16 10:42:56 -05:00
Andrey Grodzovsky	7c6e68c777	drm/amdgpu: Avoid HW GPU reset for RAS. Problem: Under certain conditions, when some IP bocks take a RAS error, we can get into a situation where a GPU reset is not possible due to issues in RAS in SMU/PSP. Temporary fix until proper solution in PSP/SMU is ready: When uncorrectable error happens the DF will unconditionally broadcast error event packets to all its clients/slave upon receiving fatal error event and freeze all its outbound queues, err_event_athub interrupt will be triggered. In such case and we use this interrupt to issue GPU reset. THe GPU reset code is modified for such case to avoid HW reset, only stops schedulers, deatches all in progress and not yet scheduled job's fences, set error code on them and signals. Also reject any new incoming job submissions from user space. All this is done to notify the applications of the problem. v2: Extract amdgpu_amdkfd_pre/post_reset from amdgpu_device_lock/unlock_adev Move amdgpu_job_stop_all_jobs_on_sched to amdgpu_job.c Remove print param from amdgpu_ras_query_error_count v3: Update based on prevoius bug fixing patch to properly call amdgpu_amdkfd_pre_reset for other XGMI hive memebers. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Acked-by: Felix Kuehling <Felix.Kuehling@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:41:05 -05:00
Christian König	9d1b3c7805	drm/amdgpu: reserve at least 4MB of VRAM for page tables v2 This hopefully helps reduce the contention for page tables. v2: adjust maximum reported VRAM size as well Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-09-13 17:38:47 -05:00
Dave Airlie	e7f7287bf5	Merge tag 'drm-next-5.4-2019-08-09' of git://people.freedesktop.org/~agd5f/linux into drm-next drm-next-5.4-2019-08-09: Same as drm-next-5.4-2019-08-06, but with the readq/writeq stuff fixed and 5.3-rc3 backmerged. amdgpu: - Add navi14 support - Add navi12 support - Add Arcturus support - Enable mclk DPM for Navi - Misc DC display fixes - Add perfmon support for DF - Add scatter/gather display support for Raven - Improve SMU handling for GPU reset - RAS support for GFX - Drop last of drmP.h - Add support for wiping memory on buffer release - Allow cursor async updates for fb swaps - Misc fixes and cleanups amdkfd: - Add navi14 support - Add navi12 support - Add Arcturus support - CWSR trap handlers updates for gfx9, 10 - Drop last of drmP.h - Update MAINTAINERS radeon: - Misc fixes and cleanups - Make kexec more reliable by tearing down the GPU ttm: - Add release_notify callback uapi: - Add wipe memory on release flag for buffer creation Signed-off-by: Dave Airlie <airlied@redhat.com> [airlied: resolved conflicts with ttm resv moving] From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20190809184807.3381-1-alexander.deucher@amd.com	2019-08-12 14:20:21 +10:00
Gerd Hoffmann	5a5011a724	drm/amdgpu: switch driver from bo->resv to bo->base.resv Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Reviewed-by: Christian König <christian.koenig@amd.com> Link: http://patchwork.freedesktop.org/patch/msgid/20190805140119.7337-14-kraxel@redhat.com	2019-08-06 08:21:54 +02:00
James Zhu	cd1fd7b381	drm/amdgpu: add harvest support for Arcturus Add VCN harvest support for Arcturus Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:06 -05:00
James Zhu	fa739f4b06	drm/amdgpu: add multiple instances support for Arcturus Arcturus has dual-VCN. Need add multiple instances support for Arcturus. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:05 -05:00
James Zhu	c01b6a1d38	drm/amdgpu: modify amdgpu_vcn to support multiple instances Arcturus has dual-VCN. Need Restruct amdgpu_device::vcn to support multiple vcns. There are no any logical changes here Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-18 14:18:05 -05:00
Evan Quan	fdafb3597a	drm/amdgpu: fix MGPU fan boost enablement for XGMI reset MGPU fan boost feature should not be enabled until all the devices from the same hive are all back from reset. Signed-off-by: Evan Quan <evan.quan@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-07-01 14:54:12 -05:00
Alex Deucher	d7929c1e13	Merge branch 'drm-next' into drm-next-5.3 Backmerge drm-next and fix up conflicts due to drmP.h removal. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-25 08:42:25 -05:00
Hawking Zhang	be9250fb96	drm/amdgpu: set the default value of pa_sc_tile_steering_override So userspace can access it. Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 18:58:21 -05:00
Jack Xiao	f92d5c6123	drm/amdgpu: enable the static csa when mcbp enabled CSA is the Context Save Area for preemption. Acked-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2019-06-21 09:36:54 -05:00

1 2 3 4 5

230 Commits