8299 Commits

Author SHA1 Message Date
Zheng Bin
8f00d1fc9d drm/amd/amdgpu: fix comparison pointer to bool warning in amdgpu_atpx_handler.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:619:15-49: WARNING: Comparison to bool
drivers/gpu/drm/amd/amdgpu/amdgpu_atpx_handler.c:629:15-49: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
3d0c75afdc drm/amd/amdgpu: fix comparison pointer to bool warning in uvd_v6_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/uvd_v6_0.c:1243:14-25: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
e66cdf250e drm/amd/amdgpu: fix comparison pointer to bool warning in si.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/si.c:1342:5-10: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
4bbbe77c15 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_2.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v5_2.c:562:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
960a06ff91 drm/amd/amdgpu: fix comparison pointer to bool warning in sdma_v5_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/sdma_v5_0.c:619:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
89cf8b0637 drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v10_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c:3563:5-31: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Zheng Bin
7b3fa67d6e drm/amd/amdgpu: fix comparison pointer to bool warning in gfx_v9_0.c
Fixes coccicheck warning:

drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c:2805:5-11: WARNING: Comparison to bool

Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Zheng Bin <zhengbin13@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:43 -04:00
Jonathan Kim
7c679ef667 drm/amdgpu: stop resetting xgmi perfmons on disable
Disabling perf events does not specify reset in ABI so stop doing it in
hardware.

Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Oak Zeng
719a6513fb drm/amdgpu: More accurate description of a function param
Add more accurate description of the pe parameter of function
amdgpu_vm_sdma_udpate and amdgpu_vm_cpu_update

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Oak Zeng
91b5900507 drm/amdgpu: Add comment to function amdgpu_ttm_alloc_gart
Add comments to refect what function does

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Reviewed-by: Christian Konig <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Andrey Grodzovsky
ce87c98db4 drm/amdgpu: Include sienna_cichlid in USBC PD FW support.
Create sysfs interface also for sienna_cichlid.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:42 -04:00
Mukul Joshi
62f6b1162e drm/amdgpu: Enable SDMA utilization for Arcturus
SDMA utilization calculations are enabled/disabled by
writing to SDMAx_PUB_DUMMY_REG2 register. Currently,
enable this only for Arcturus.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Aurabindo Pillai
5d1c59c479 drm/amdgpu: Move existing pflip fields into separate struct
[Why&How]
To refactor DM IRQ management, all fields used by IRQ is best moved
to a separate struct so that main amdgpu_crtc struct need not be changed
Location of the new struct shall be in DM

Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
John Clements
9c7e2ceb1d drm/amdgpu: Update RAS init handling
Output RAS init status

If RAS init fails, teardown RAS context

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Changfeng
f399d4de2d drm/amdgpu: add ta DTM/HDCP print in amdgpu_firmware_info for apu
It needs to add ta DTM/HDCP print to get HDCP/DTM version info when cat
amdgpu_firmware_info

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:52:39 -04:00
Andrey Grodzovsky
7cbbc745dc drm/amdgpu: Minor checkpatch fix
Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:29 -04:00
Andrey Grodzovsky
6894305c97 drm/amdgpu: Disable DPC for XGMI for now.
XGMI support is more complicated than single device support as
questions of synchronization between the device recovering from
PCI error and other members of the hive are required.
Leaving this for next round.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:22 -04:00
Andrey Grodzovsky
7ac71382e9 drm/amdgpu: Trim amdgpu_pci_slot_reset by reusing code.
Reuse exsisting functions from GPU recovery to avoid code
duplications.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:16 -04:00
Andrey Grodzovsky
c1dd4aa624 drm/amdgpu: Fix consecutive DPC recovery failures.
Cache the PCI state on boot and before each case where we might
loose it.

v2: Add pci_restore_state while caching the PCI state to avoid
breaking PCI core logic for stuff like suspend/resume.

v3: Extract pci_restore_state from amdgpu_device_cache_pci_state
to avoid superflous restores during GPU resets and suspend/resumes.

v4: Style fixes.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:25:04 -04:00
Andrey Grodzovsky
362c7b91c1 drm/amdgpu: Fix SMU error failure
Wait for HW/PSP initiated ASIC reset to complete before
starting the recovery operations.

v2: Remove typo

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:55 -04:00
Andrey Grodzovsky
acd89fca67 drm/amdgpu: Block all job scheduling activity during DPC recovery
DPC recovery involves ASIC reset just as normal GPU recovery so block
SW GPU schedulers and wait on all concurrent GPU resets.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:48 -04:00
Andrey Grodzovsky
bf36b52e78 drm/amdgpu: Avoid accessing HW when suspending SW state
At this point the ASIC is already post reset by the HW/PSP
so the HW not in proper state to be configured for suspension,
some blocks might be even gated and so best is to avoid touching it.

v2: Rename in_dpc to more meaningful name

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:39 -04:00
Andrey Grodzovsky
c9a6b82f45 drm/amdgpu: Implement DPC recovery
Add PCI Downstream Port Containment (DPC) with
basic recovery functionality

v2: remove pci_save_state to avoid breaking suspend/resume
v3: Fix style comments
v4: Improve description.

Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:32 -04:00
Liu ChengZhe
2a9787dcf5 drm/amdgpu: Do gpu recovery when no job is running
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.

v2: modify the description

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Liu ChengZhe <ChengZhe.Liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-15 17:24:18 -04:00
Christian König
9c3006a4cc drm/ttm: remove available_caching
Instead of letting TTM make an educated guess based on
some mask all drivers should just specify what caching
they want for their CPU mappings.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390207/
2020-09-15 16:05:19 +02:00
Christian König
0fe438cec9 drm/ttm: remove default caching
As far as I can tell this was never used either and we just
always fallback to the order cached > wc > uncached anyway.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/390142/
2020-09-15 16:03:44 +02:00
Maxime Ripard
00af6729b5
Merge drm/drm-next into drm-misc-next
Paul Cercueil needs some patches in -rc5 to apply new patches for ingenic
properly.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2020-09-14 18:11:40 +02:00
Christian König
48e07c23cb drm/ttm: nuke memory type flags
It's not supported to specify more than one of those flags.
So it never made sense to make this a flag in the first place.

Nuke the flags and specify directly which memory type to use.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389826/?series=81551&rev=1
2020-09-11 13:31:23 +02:00
Gerd Hoffmann
707d561f77 drm: allow limiting the scatter list size.
Add drm_device argument to drm_prime_pages_to_sg(), so we can
call dma_max_mapping_size() to figure the segment size limit
and call into __sg_alloc_table_from_pages() with the correct
limit.

This fixes virtio-gpu with sev.  Possibly it'll fix other bugs
too given that drm seems to totaly ignore segment size limits
so far ...

v2: place max_segment in drm driver not gem object.
v3: move max_segment next to the other gem fields.
v4: just use dma_max_mapping_size().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: http://patchwork.freedesktop.org/patch/msgid/20200907112425.15610-2-kraxel@redhat.com
2020-09-09 07:58:56 +02:00
Dave Airlie
877d8c0743 Merge tag 'topic/nouveau-i915-dp-helpers-and-cleanup-2020-08-31-1' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
UAPI Changes:

None

Cross-subsystem Changes:

* Moves a bunch of miscellaneous DP code from the i915 driver into a set
  of shared DRM DP helpers

Core Changes:

* New DRM DP helpers (see above)

Driver Changes:

* Implements usage of the aforementioned DP helpers in the nouveau
  driver, along with some other various HPD related cleanup for nouveau

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lyude Paul <lyude@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/11e59ebdea7ee4f46803a21fe9b21443d2b9c401.camel@redhat.com
2020-09-09 12:27:13 +10:00
Dave Airlie
5d26eba988 drm/amdgpu/ttm: move to driver backend binding funcs
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-9-airlied@gmail.com
2020-09-09 08:30:24 +10:00
Dave Airlie
ecfe6953fa drm/ttm: introduce ttm_bo_move_null
This pattern is cut-n-pasted across 4 drivers, switch it to
a WARN_ON instead, as BUG_ON is considered a bad idea usually.

Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200907204630.1406528-2-airlied@gmail.com
2020-09-09 08:28:53 +10:00
Christian König
54d04ea8cd drm/ttm: merge offset and base in ttm_bus_placement
This is used by TTM to communicate the physical address
which should be used with ioremap(), ioremap_wc(). We don't
need to separate the base and offset in any way here.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/389457/
2020-09-08 10:43:30 +02:00
Dave Airlie
0c8d22fcae Merge tag 'amd-drm-next-5.10-2020-09-03' of git://people.freedesktop.org/~agd5f/linux into drm-next
amd-drm-next-5.10-2020-09-03:

amdgpu:
- RAS fixes
- Sienna Cichlid updates
- Navy Flounder updates
- DCE6 (SI) support in DC
- Enable plane rotation
- Rework pre-OS vram reservation handling during driver init
- Add standard interface to dump GPU metrics table from SMU
- Rework tiling and tmz state handling in atomic commits
- Pstate fixes
- Add voltage and power hwmon interfaces for renoir
- SW CTF fixes
- S/G display fix for Raven
- Print client strings for vmfaults for vega and newer
- Manual fan control fixes
- Display updates
- Reorg power management directory structure
- Misc bug fixes
- Misc code cleanups

amdkfd:
- Topology fixes
- Add SMI events for thermal throttling and GPU resets

radeon:
- switch from pci_* to dma_* for dma allocations
- PLL fix

Scheduler:
- Clean up priority levels

UAPI:
- amdgpu INFO IOCTL query update for TMZ state
  https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049
- amdkfd SMI event interface updates
  https://github.com/RadeonOpenCompute/rocm_smi_lib/tree/therm_thrott

From: Alex Deucher <alexdeucher@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200903222921.4152-1-alexander.deucher@amd.com
2020-09-08 16:40:13 +10:00
Dave Airlie
ce5c207c6b Linux 5.9-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9VerweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGhc4H/iHD6qLdB36gZB6K
 oc2nJyrqyWitv4ti2Mnt5PA7o4wX4l6nnr1QvoaJ4BRs5Ja1czRvb2XDmdzqAoIA
 xITGoafqaAeDfxQ91bWrJsVN0pCRKiGwddXlU7TWmqw/riAkfOqi6GYKvav4biJH
 +n1mUPQb1M2IbRFsqkAS+ebKHq3CWaRvzKOEneS88nGlL5u31S9NAru8Ru/fkxRn
 6CwGcs1XRaBPYaZAhdfIb0NuatUlpkhPC9yhNS9up6SqrWmK3m65vmFVng6H0eCF
 fwn1jVztboY/XcNAi5sM9ExpQCql6WLQEEktVikqRDojC8fVtSx6W55tPt7qeaoO
 Z6t4/DA=
 =bcA4
 -----END PGP SIGNATURE-----

Merge tag 'v5.9-rc4' into drm-next

Backmerge 5.9-rc4 as there is a nasty qxl conflict
that needs to be resolved.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2020-09-08 14:41:40 +10:00
Dave Airlie
0a667b5007 drm/ttm: remove bdev from ttm_tt
I want to split this structure up and use it differently,
step one remove bdev pointer from it and pass it explicitly.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-4-airlied@gmail.com
2020-09-08 06:39:21 +10:00
Alex Deucher
11bc98bd71 drm/amdgpu/mmhub2.0: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:34 -04:00
Alex Deucher
02f23f5f7c drm/amdgpu/gmc9: print client id string for mmhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:29 -04:00
Alex Deucher
93fabd84c9 drm/amdgpu/gmc10: print client id string for gfxhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:27 -04:00
Alex Deucher
be99ecbfff drm/amdgpu/gmc9: print client id string for gfxhub
Print the name of the client rather than the number.  This
makes it easier to debug what block is causing the fault.

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:23 -04:00
Ye Bin
2d37949dc3 drm/amdgpu/gfx10: Delete some duplicated argument to '|'
1. gfx_v10_0_soft_reset GRBM_STATUS__SPI_BUSY_MASK
2. gfx_v10_0_update_gfx_clock_gating AMD_CG_SUPPORT_GFX_CGLS

Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Ye Bin <yebin10@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:18 -04:00
Changfeng
6627d1c1a8 drm/amdgpu: add ta firmware load in psp_v12_0 for renoir
It needs to load renoir_ta firmware because hdcp is enabled by default
for renoir now. This can avoid error:DTM TA is not initialized

Signed-off-by: Changfeng <Changfeng.Zhu@amd.com>
Reviewed-by: Bhawanpreet Lakha <Bhawanpreet.Lakha@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:11 -04:00
Christian König
ee354ff1c7 drm/amdgpu: fix max_entries calculation v4
Calculate the correct value for max_entries or we might run after the
page_address array.

v2: Xinhui pointed out we don't need the shift
v3: use local copy of start and simplify some calculation
v4: fix the case that we map less VA range than BO size

Signed-off-by: Christian König <christian.koenig@amd.com>
Fixes: 1e691e244487 drm/amdgpu: stop allocating dummy GTT nodes
Reviewed-by: xinhui pan <xinhui.pan@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:48:04 -04:00
Luben Tuikov
1625951a3a drm/amdgpu: Remove superfluous NULL check
The DRM device is a static member of
the amdgpu device structure and as such
always exists, so long as the PCI and
thus the amdgpu device exist.

Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:55 -04:00
Alex Sierra
abb6fccbb4 drm/amdgpu: enable ih1 ih2 for Arcturus only
Enable multi-ring ih1 and ih2 for Arcturus only.
For Navi10 family multi-ring has been disabled.
Apparently, having multi-ring enabled in Navi was causing
continus page fault interrupts.
Further investigation is needed to get to the root cause.
Related issue link:
https://gitlab.freedesktop.org/drm/amd/-/issues/1279

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:48 -04:00
xinhui pan
3d7248d7ce drm/amdgpu: Fix a redundant kfree
drm_dev_alloc() alloc *dev* and set managed.final_kfree to dev to free
itself.
Now from commit 5cdd68498918("drm/amdgpu: Embed drm_device into
amdgpu_device (v3)") we alloc *adev* and ddev is just a member of it.
So drm_dev_release try to free a wrong pointer then.

Also driver's release trys to free adev, but drm_dev_release will
access dev after call drvier's release.

To fix it, remove driver's release and set managed.final_kfree to adev.

[   36.269348] BUG: unable to handle page fault for address: ffffa0c279940028
[   36.276841] #PF: supervisor read access in kernel mode
[   36.282434] #PF: error_code(0x0000) - not-present page
[   36.288053] PGD 676601067 P4D 676601067 PUD 86a414067 PMD 86a247067 PTE 800ffff8066bf060
[   36.296868] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC NOPTI
[   36.302409] CPU: 4 PID: 1375 Comm: bash Tainted: G           O      5.9.0-rc2+ #46
[   36.310670] Hardware name: System manufacturer System Product Name/PRIME Z390-A, BIOS 1401 11/26/2019
[   36.320725] RIP: 0010:drm_managed_release+0x25/0x110 [drm]
[   36.326741] Code: 80 00 00 00 00 0f 1f 44 00 00 55 48 c7 c2 5a 9f 41 c0 be 00 02 00 00 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 ec 08 <48> 8b 7f 18 e8 c2 10 ff ff 4d 8b 74 24 20 49 8d 44 24 5
[   36.347217] RSP: 0018:ffffb9424141fce0 EFLAGS: 00010282
[   36.352931] RAX: 0000000000000006 RBX: ffffa0c279940010 RCX: 0000000000000006
[   36.360718] RDX: ffffffffc0419f5a RSI: 0000000000000200 RDI: ffffa0c279940010
[   36.368503] RBP: ffffb9424141fd10 R08: 0000000000000001 R09: 0000000000000001
[   36.376304] R10: 0000000000000000 R11: 0000000000000000 R12: ffffa0c279940010
[   36.384070] R13: ffffffffc0e2a000 R14: ffffa0c26924e220 R15: fffffffffffffff2
[   36.391845] FS:  00007fc4a277b740(0000) GS:ffffa0c288e00000(0000) knlGS:0000000000000000
[   36.400669] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   36.406937] CR2: ffffa0c279940028 CR3: 0000000792304006 CR4: 00000000003706e0
[   36.414732] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   36.422550] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   36.430354] Call Trace:
[   36.433044]  drm_dev_put.part.0+0x40/0x60 [drm]
[   36.438017]  drm_dev_put+0x13/0x20 [drm]
[   36.442398]  amdgpu_pci_remove+0x56/0x60 [amdgpu]
[   36.447528]  pci_device_remove+0x3e/0xb0
[   36.451807]  device_release_driver_internal+0xff/0x1d0
[   36.457416]  device_release_driver+0x12/0x20
[   36.462094]  pci_stop_bus_device+0x70/0xa0
[   36.466588]  pci_stop_and_remove_bus_device_locked+0x1b/0x30
[   36.472786]  remove_store+0x7b/0x90
[   36.476614]  dev_attr_store+0x17/0x30
[   36.480646]  sysfs_kf_write+0x4b/0x60
[   36.484655]  kernfs_fop_write+0xe8/0x1d0
[   36.488952]  vfs_write+0xf5/0x230
[   36.492562]  ksys_write+0x70/0xf0
[   36.496206]  __x64_sys_write+0x1a/0x20
[   36.500292]  do_syscall_64+0x38/0x90
[   36.504219]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Signed-off-by: xinhui pan <xinhui.pan@amd.com>
Acked-by: Alex Deucher <alexancer.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:47:33 -04:00
Dennis Li
81202807ae drm/amdgpu: block ring buffer access during GPU recovery
When GPU is in reset, its status isn't stable and ring buffer also need
be reset when resuming. Therefore driver should protect GPU recovery
thread from ring buffer accessed by other threads. Otherwise GPU will
randomly hang during recovery.

v2: correct indent

Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dennis Li <Dennis.Li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:46:55 -04:00
Nirmoy Das
bc21585f3f drm/amdgpu: disable gpu-sched load balance for uvd
On hardware with multiple uvd instances, dependent uvd jobs
may get scheduled to different uvd instances. Because uvd_enc
jobs retain hw context, dependent jobs should always run on the
same uvd instance. This patch disables GPU scheduler's load balancer
for a context that binds jobs from the same context to a uvd
instance.

v2: Squash in uvd_enc fix

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-09-03 14:46:54 -04:00
Dave Airlie
05010c1e2f drm/amdgpu/ttm: remove unused parameter to move blit
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200826014428.828392-2-airlied@gmail.com
2020-08-31 12:41:50 +10:00
Nirmoy Das
e230ac1118 drm/amdgpu: fix compiler warnings
Fixes below compiler warnings:
 CC [M]  drivers/gpu/drm/amd/amdgpu/amdgpu_device.o
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
  381 | void static inline amdgpu_mm_wreg_mmio(struct amdgpu_device *adev, uint32_t reg, uint32_t v, uint32_t acc_flags)
      | ^~~~
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:381:1: warning: ‘inline’ is not at beginning of declaration [-Wold-style-declaration]
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c: In function ‘amdgpu_device_fini’:
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c:3381:6: warning: variable ‘r’ set but not used [-Wunused-but-set-variable]
 3381 |  int r;
      |      ^

Signed-off-by: Nirmoy Das <nirmoy.das@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2020-08-28 14:00:23 -04:00