linux

iv/linux

Author	SHA1	Message	Date
Guchun Chen	313c8fd33e	drm/amdgpu: log on non-zero error conter per IP before GPU reset Once sync flood interrupt is triggered by RAS error, before actual GPU recovery job, it's necessary to log on and print non-zero error counter, this will help user knows where the RAS error source is from quickly. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:36:26 -05:00
changzhu	debcf83770	drm/amdgpu: add is_raven_kicker judgement for raven1 The rlc version of raven_kicer_rlc is different from the legacy rlc version of raven_rlc. So it needs to add a judgement function for raven_kicer_rlc and avoid disable GFXOFF when loading raven_kicer_rlc. Signed-off-by: changzhu <Changfeng.Zhu@amd.com> Reviewed-by: Huang Rui <ray.huang@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-19 10:33:42 -05:00
Maxime Ripard	28f2aff1ca	Linux 5.6-rc2 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl5JsVQeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGHZEH+wddtJO4dZk5TZdF KZB2w2ldsCvrBZAmas1TVvm8ncvUf+ATUZcSIzvZ3YbLmxsuLF2Cz3kD+n+36Mvy ejwq8Scl7jwnouYps/Gfd6rRj/uCafqST4qp15GMGeiy2ST4A8dJrv5IAgZhD8/N SN1bSr1AXpZ2JlEzzLDQ/NdVoNMS6IzCOsaINZcc60/XQoQZFRBWamMJFqu+CmXD SBJOybQNFJhziy45cGZSAl+67sSCcoPftwTs0Stu4CJsvFWRb3MsbNTDS51Hjcc4 3tdgGOhoNXzyzZr96MEAHmiaW4VLQv0PGgUfOajE35viMz48OrwCTru8Kbuae3XM YHV4qJk= =yiem -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXkpecgAKCRDj7w1vZxhR xU4CAQD/8ZzfYroiiEI7FHEgvnuldGYZqA9kbNAO2Vueeac0OgD+Jojewdxy+pJ9 dxfA/POdtzx3hOdN+U1YgaDC0hbXwww= =XgKq -----END PGP SIGNATURE----- Merge v5.6-rc2 into drm-misc-next Lyude needs some patches in 5.6-rc2 and we didn't bring drm-misc-next forward yet, so it looks like a good occasion. Signed-off-by: Maxime Ripard <maxime@cerno.tech>	2020-02-17 10:34:34 +01:00
Alex Deucher	b08c3ed609	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	120cf95930	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Alex Deucher	c657b936ea	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org	2020-02-14 12:58:58 -05:00
Bhawanpreet Lakha	c6f8c44044	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: 143f23053333 ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-14 12:58:58 -05:00
Thomas Zimmermann	e3eff4b5d9	drm/amdgpu: Convert to CRTC VBLANK callbacks VBLANK callbacks in struct drm_driver are deprecated in favor of their equivalents in struct drm_crtc_funcs. Convert amdgpu over. v2: * don't wrap existing functions; change signature instead Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-6-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Thomas Zimmermann	ea702333e5	drm/amdgpu: Convert to struct drm_crtc_helper_funcs.get_scanout_position() The callback struct drm_driver.get_scanout_position() is deprecated in favor of struct drm_crtc_helper_funcs.get_scanout_position(). Convert amdgpu over. Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20200123135943.24140-5-tzimmermann@suse.de	2020-02-13 13:08:13 +01:00
Dan Carpenter	434cbcb1bd	drm/amdgpu: return -EFAULT if copy_to_user() fails The copy_to_user() function returns the number of bytes remaining to be copied, but we want to return a negative error code to the user. Fixes: 030d5b97a54b ("drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read") Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	72b4c01d66	drm/amdgpu/gfx10: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:41 -05:00
Alex Deucher	e5f134958d	drm/amdgpu/gfx9: disable gfxoff when reading rlc clock Otherwise we readback all ones. Fixes rlc counter readback while gfxoff is active. Reviewed-by: Xiaojie Yuan <xiaojie.yuan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Alex Deucher	b90c4d667c	drm/amdgpu/soc15: fix xclk for raven It's 25 Mhz (refclk / 4). This fixes the interpretation of the rlc clock counter. Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:04:40 -05:00
Bhawanpreet Lakha	c786530b21	drm/amd/display: fix dtm unloading there was a type in the terminate command. We should be calling psp_dtm_unload() instead of psp_hdcp_unload() Fixes: 143f23053333 ("drm/amdgpu: psp DTM init") Signed-off-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com> Reviewed-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:03:22 -05:00
Alex Deucher	4fdda2e66d	drm/amdgpu/runpm: enable runpm on baco capable VI+ asics Seems to work reliably on VI+ except for a few so enable runpm barring those where baco for runtime power management is not supported. [rajneesh] Picked https://patchwork.freedesktop.org/patch/335402/ to enable runtime pm with baco for kfd. Also fixed a checkpatch warning and added extra checks for VEGA20 and ARCTURUS. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:01:03 -05:00
Rajneesh Bhardwaj	9593f4d6a6	drm/amdkfd: refactor runtime pm for baco So far the kfd driver implemented same routines for runtime and system wide suspend and resume (s2idle or mem). During system wide suspend the kfd aquires an atomic lock that prevents any more user processes to create queues and interact with kfd driver and amd gpu. This mechanism created problem when amdgpu device is runtime suspended with BACO enabled. Any application that relies on kfd driver fails to load because the driver reports a locked kfd device since gpu is runtime suspended. However, in an ideal case, when gpu is runtime suspended the kfd driver should be able to: - auto resume amdgpu driver whenever a client requests compute service - prevent runtime suspend for amdgpu while kfd is in use This change refactors the amdgpu and amdkfd drivers to support BACO and runtime power management. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:54 -05:00
Rajneesh Bhardwaj	70bedd68e7	drm/amdgpu: Fix missing error check in suspend amdgpu_device_suspend might return an error code since it can be called from both runtime and system suspend flows. Add the missing return code in case of a failure. Reviewed-by: Oak Zeng <oak.zeng@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Rajneesh Bhardwaj <rajneesh.bhardwaj@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-12 16:00:25 -05:00
Guchun Chen	a934f9d866	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:40:12 -05:00
James Zhu	b5336bfd6f	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:22 -05:00
Guchun Chen	2cabe0d4cd	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:37:02 -05:00
Jonathan Kim	46d1da733f	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:35:54 -05:00
James Zhu	f4d0242b7b	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 15:10:36 -05:00
Alex Deucher	f0f7ddfc34	drm/amdgpu: add flag for runtime suspend So we know whether we in are in runtime suspend or system suspend. Reviewed-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:52 -05:00
xinhui pan	a6605c43f9	drm/amdgpu: Do not move root PT bo to relocated list As root PD has no parent, we just need move its status to idle. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: xinhui pan <xinhui.pan@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> CC: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:38 -05:00
Guchun Chen	278628fa46	drm/amdgpu: correct comment to clear up the confusion Former comment looks to be one intended behavior in code, actually it's not. So correct it. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Guchun Chen <guchun.chen@amd.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:51:31 -05:00
James Zhu	3b4a18a355	drm/amdgpu/vcn2.5: fix warning Fix warning during switching to dpg pause mode for VCN firmware Version ENC: 1.1 DEC: 1 VEP: 0 Revision: 16 Signed-off-by: James Zhu <James.Zhu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:23 -05:00
Guchun Chen	ea6f0931c1	drm/amdgpu: limit GDS clearing workaround in cold boot sequence GDS clear workaround will cause gfx failure in suspend/resume case. [ 98.679559] [drm:amdgpu_device_ip_late_init [amdgpu]] ERROR late_init of IP block <gfx_v9_0> failed -110 [ 98.679561] PM: dpm_run_callback(): pci_pm_resume+0x0/0xa0 returns -110 [ 98.679562] PM: Device 0000:03:00.0 failed to resume async: error -110 As this workaround is specific to the HW bug of GDS's ECC error existing in cold boot up, so bypass this workaround in suspend/ resume case after booting up. Signed-off-by: Guchun Chen <guchun.chen@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-11 11:49:16 -05:00
Linus Torvalds	c16b99d6c5	drm fixes for 5.6-rc1 tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes -----BEGIN PGP SIGNATURE----- iQIcBAABAgAGBQJePNasAAoJEAx081l5xIa+axgP+wZbUkJZYt75p5thhptp2CY3 235yIZQcDy0T+J/34Agv0e5DfK2hCafPIVv7dIV1A9AjWxhVzIoZmD2mHSLKnxVb MsfemKsrm1WO9KezDkDIEOgkN5rY811xTIsg89VSjyiiPXQmwAcXQBbmIBHg9YtT 7ttek1afJv7h0HD4W2JdTxLEeKn9lpZHUXESXz+xoE8OUMDizMgmbt8nWXlbFpcw Tx1EmxZ3jbCYGsrcQBOLURZiwzOb2zbXJ9o8BgNdomvtI69l2/3mMonavw+IgVq+ aiHMgWViVsDDP8ijn8uBa7EEqtGWuqTRaL/Ei71w2MGkG7S3Nz4PxbLbl5P9NHAc D3OsiaXGdNhoSGfGD6sr7TlOIEo7gjLGKJ3fMW15XmBF4isb0lwlnIB7dTiU9Zaq UBMpNA9/Vtty2MtT1SkVKQrC+Mujl4lAYssWhcrh1/RX0Ij6do4aFswy17dbVg3Q LIigj1quIfNPXONb0fwkg5JGQyOMnrYs0Q6SkrhQfSR2UzsNWnGoBRyROcZipNnQ STaM8F6eEi7RbFUaT0uWL71GBxM00PyjQgpf5fnrOiKp6rfgK3B3ThpsVoAQ2OW4 CnodlCa1uxIkIgQdcoG/3vT5B6nzpdhgRVjHt+N5sraLjOZFZHxex0YmJAYXcN+/ QR8ARNgILJvA14xCsOT7 =C0nE -----END PGP SIGNATURE----- Merge tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm Pull drm fixes from Dave Airlie: "Just some fixes for this merge window: the tegra changes fix some regressions in the merge, nouveau has a few modesetting fixes. The amdgpu fixes are bit bigger, but they contain a couple of weeks of fixes, and don't seem to contain anything that isn't really a fix. Summary: tegra: - merge window regression fixes nouveau: - couple of volta/turing modesetting fixes amdgpu: - EDC fixes for Arcturus - GDDR6 memory training fixe - Fix for reading gfx clockgating registers while in GFXOFF state - i2c freq fixes - Misc display fixes - TLB invalidation fix when using semaphores - VCN 2.5 instancing fixes - Switch raven1 gfxoff to a blacklist - Coreboot workaround for KV/KB - Root cause dongle fixes for display and revert workaround - Enable GPU reset for renoir and navi - Navi overclocking fixes - Fix up confusing warnings in display clock validation on raven amdkfd: - SDMA fix radeon: - Misc LUT fixes" * tag 'drm-next-2020-02-07' of git://anongit.freedesktop.org/drm/drm: (90 commits) gpu: host1x: Set DMA direction only for DMA-mapped buffer objects drm/tegra: Reuse IOVA mapping where possible drm/tegra: Relax IOMMU usage criteria on old Tegra drm/amd/dm/mst: Ignore payload update failures drm/amdgpu: update default voltage for boot od table for navi1x drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_voltage drm/amdgpu/smu10: fix smu10_get_clock_by_type_with_latency drm/amdgpu/display: handle multiple numbers of fclks in dcn_calcs.c (v2) drm/amdgpu: fetch default VDDC curve voltages (v2) drm/amdgpu/smu_v11_0: Correct behavior of restoring default tables (v2) drm/amdgpu/navi10: add OD_RANGE for navi overclocking drm/amdgpu/navi: fix index for OD MCLK drm/amd/display: Fix HW/SW state mismatch drm/amd/display: Fix a typo when computing dsc configuration drm/amd/powerplay: fix navi10 system intermittent reboot issue V2 drm/amdkfd: Fix a bug in SDMA RLC queue counting under HWS mode drm/amd/display: Only enable cursor on pipes that need it drm/nouveau/kms/gv100-: avoid sending a core update until the first modeset drm/nouveau/kms/gv100-: move window ownership setup into modesetting path drm/nouveau/disp/gv100-: halt NV_PDISP_FE_RM_INTR_STAT_CTRL_DISP_ERROR storms ...	2020-02-07 12:46:08 -08:00
Christian König	dd1ab79910	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_access_memory v2 Make use of the better performance here as well. This patch is only compile tested! v2: fix calculation bug pointed out by Felix Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:39 -05:00
Christian König	030d5b97a5	drm/amdgpu: use amdgpu_device_vram_access in amdgpu_ttm_vram_read This speeds up the access quite a bit from 2.2 MB/s to 2.9 MB/s on 32bit and 12,8 MB/s on 64bit. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:33 -05:00
Christian König	c12b84d6e0	drm/amdgpu: use the BAR if possible in amdgpu_device_vram_access v2 This should speed up debugging VRAM access a lot. v2: add HDP flush/invalidate Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:25 -05:00
Christian König	ce05ac56e6	drm/amdgpu: optimize amdgpu_device_vram_access a bit. Only write the _HI register when necessary. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Acked-by: Jonathan Kim <Jonathan.Kim@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:17 -05:00
Jonathan Kim	42d708db8e	drm/amdgpu: fix amdgpu pmu to use hwc->config instead of hwc->conf hwc->conf was designated specifically for AMD APU IOMMU purposes. This could cause problems in performance and/or function since APU IOMMU implementation is elsewhere. Also hwc->conf and hwc->config are different members of an anonymous union so hwc->conf aliases as hw->last_tag. Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:45:05 -05:00
James Zhu	80ff3e10c8	drm/amdgpu/vcn2.5: fix DPG mode power off issue on instance 1 Support pause_state for multiple instance, and it will fix vcn2.5 DPG mode power off issue on instance 1. Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:51 -05:00
Jack Zhang	86b93fd62d	drm/amdgpu/sriov Don't send msg when smu suspend For sriov and pp_onevf_mode, do not send message to set smu status, because smu doesn't support these messages under VF. Besides, it should skip smu_suspend when pp_onevf_mode is disabled. Signed-off-by: Jack Zhang <Jack.Zhang1@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Evan Quan <evan.quan@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-07 11:44:24 -05:00
Hawking Zhang	0b9d37609a	drm/amdgpu: move xgmi init/fini to xgmi_add/remove_device call (v2) For sriov, psp ip block has to be initialized before ih block for the dynamic register programming interface that needed for vf ih ring buffer. On the other hand, current psp ip block hw_init function will initialize xgmi session which actaully depends on interrupt to return session context. This results an empty xgmi ta session id and later failures on all the xgmi ta cmd invoked from vf. xgmi ta session initialization has to be done after ih ip block hw_init call. to unify xgmi session init/fini for both bare-metal sriov virtualization use scenario, move xgmi ta init to xgmi_add_device call, and accordingly terminate xgmi ta session in xgmi_remove_device call. The existing suspend/resume sequence will not be changed. v2: squash in return fix from Nirmoy Signed-off-by: Hawking Zhang <Hawking.Zhang@amd.com> Reviewed-by: Frank Min <Frank.Min@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-06 15:04:36 -05:00
Christian König	9f3cc18d19	drm/amdgpu: rework synchronization of VM updates v4 If provided we only sync to the BOs reservation object and no longer to the root PD. v2: update comment, cleanup amdgpu_bo_sync_wait_resv v3: use correct reservation object while clearing v4: fix typo in amdgpu_bo_sync_wait_resv Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	4939d973b6	drm/amdgpu: simplify and fix amdgpu_sync_resv No matter what we always need to sync to moves. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	fe6796ac12	drm/amdgpu: allow higher level PD invalidations Allow partial invalidation on unallocated PDs. This is useful when we need to silence faults to stop interrupt floods on Vega. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	7d28efe0c3	drm/amdgpu: return EINVAL instead of ENOENT in the VM code That we can't find a PD above the root is expected can only happen if we try to update a larger range than actually managed by the VM. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	bfcd6c69e4	drm/amdgpu: fix parentheses in amdgpu_vm_update_ptes For the root PD mask can be 0xffffffff as well which would overrun to 0 if we don't cast it before we add one. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Tom St Denis <tom.stdenis@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	46cf5f7626	drm/amdgpu: make sure to never allocate PDs/PTs for invalidations Make sure that we never allocate a page table for an invalidation operation. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	55cdd4e9b9	drm/amdgpu: drop unnecessary restriction for huge root PDEs The root PD can also contain huge PDEs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	ef48d4b39e	drm/amdgpu: stop using amdgpu_bo_gpu_offset in the VM backend We need to update page tables without any lock held. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	5d3196605d	drm/amdgpu: rework job synchronization v2 For unlocked page table updates we need to be able to sync to fences of a specific VM. v2: use SYNC_ALWAYS in the UVD code Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	114fbc3195	drm/amdgpu: use the VM as job owner For HMM we need to rework how VM synchronization works, so instead of the filp use VM as job owner. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Christian König	81417bea87	drm/amdgpu: explicitly sync VM update to PDs/PTs Explicitly sync VM updates to the moving fence in PDs and PTs. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-02-04 23:30:39 -05:00
Nathan Chancellor	968162204a	drm/amdgpu: Fix implicit enum conversion in gfx_v9_4_ras_error_inject Clang warns: ../drivers/gpu/drm/amd/amdgpu/gfx_v9_4.c:967:35: warning: implicit conversion from enumeration type 'enum amdgpu_ras_block' to different enumeration type 'enum ta_ras_block' [-Wenum-conversion] block_info.block_id = info->head.block; ~ ~~~~~~~~~~~^~~~~ 1 warning generated. Use the function added in commit 828cfa29093f ("drm/amdgpu: Fix amdgpu ras to ta enums conversion") that handles this conversion explicitly. Fixes: 4c461d89db4f ("drm/amdgpu: add RAS support for the gfx block of Arcturus") Link: https://github.com/ClangBuiltLinux/linux/issues/849 Signed-off-by: Nathan Chancellor <natechancellor@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:43 -05:00
Stephen Rothwell	7044cb6c20	amdgpu: using vmalloc requires includeing vmalloc.h Fixes: 240c811ccde4 ("drm/amdgpu: fix VRAM partially encroached issue in GDDR6 memory training(V2)") Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00
Nirmoy Das	977f7e1068	drm/amdgpu: allocate entities on demand Currently we pre-allocate entities and fences for all the HW IPs on context creation and some of which are might never be used. This patch tries to resolve entity/fences wastage by creating entity only when needed. v2: allocate memory for entity and fences together Signed-off-by: Nirmoy Das <nirmoy.das@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2020-01-30 17:15:42 -05:00

... 26 27 28 29 30 ...

8299 Commits