1017260 Commits

Author SHA1 Message Date
Alex Deucher
3e2eae8db2 drm/amdgpu: add I2C_CLASS_HWMON to SMU i2c buses
Not sure that this really matters that much, but these could
have various other hwmon chips on them.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
39ed82d1d9 drm/amdgpu: i2c subsystem uses 7 bit addresses
Convert from 8 bit to 7 bit.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
25e5c09f2b drm/amdgpu/ras: switch fru eeprom handling to use generic helper (v2)
Use the new helper rather than doing i2c transfers directly.

v2: fix typo

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
24f55c0559 drm/amdgpu/ras: switch ras eeprom handling to use generic helper
Use the new helper rather than doing i2c transfers directly.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
00e3a289d9 drm/amdgpu: add new helper for handling EEPROM i2c transfers
Encapsulates the i2c protocol handling so other parts of the
driver can just tell it the offset and size of data to write.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
af01340bc4 drm/amdgpu/pm: add smu i2c implementation for navi1x (v5)
And handle more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)
v5: squash in i2c channel fix

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
f400b6cec8 drm/amdgpu/pm: rework i2c xfers on arcturus (v5)
Make it generic so we can support more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)
v5: squash in i2c channel fix

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
5125c96a9d drm/amdgpu/pm: rework i2c xfers on sienna cichlid (v4)
Make it generic so we can support more than just EEPROMs.

v2: fix restart handling between transactions.
v3: handle 7 to 8 bit addr conversion
v4: Fix &req --> req. (Luben T)

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Luben Tuikov <luben.tuikov@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Alex Deucher
6963d6c176 drm/amdgpu: add a mutex for the smu11 i2c bus (v2)
So we lock software as well as hardware access to the bus.

v2: fix mutex handling.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Luben Tuikov <luben.tuikov@amd.com>
2021-07-01 00:24:39 -04:00
Mukul Joshi
93c5bcd4ea drm/amdgpu: Conditionally reset SDMA RAS error counts
Reset SDMA RAS error counts during init only if persistent
EDC harvesting is not supported.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-by: John Clements <john.clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
7981ec6549 drm/amdkfd: Maintain svm_bo reference in page->zone_device_data
Each zone-device page holds a reference to the SVM BO that manages its
backing storage. This is necessary to correctly hold on to the BO in
case zone_device pages are shared with a child-process.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
3bf8282c6b drm/amdkfd: add invalid pages debug at vram migration
This is for debug purposes only.
It conditionally generates partial migrations to test mixed
CPU/GPU memory domain pages in a prange easily.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
6ffecc946f drm/amdkfd: skip migration for pages already in VRAM
Migration skipped for pages that are already in VRAM
domain. These could be the result of previous partial
migrations to SYS RAM, and prefetch back to VRAM.
Ex. Coherent pages in VRAM that were not written/invalidated after
a copy-on-write.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
1ade5f84cc drm/amdkfd: skip invalid pages during migrations
Invalid pages can be the result of pages that have been migrated
already due to copy-on-write procedure or pages that were never
migrated to VRAM in first place. This is not an issue anymore,
as pranges now support mixed memory domains (CPU/GPU).

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
1d5dbfe6c0 drm/amdkfd: classify and map mixed svm range pages in GPU
[Why]
svm ranges can have mixed pages from device or system memory.
A good example is, after a prange has been allocated in VRAM and a
copy-on-write is triggered by a fork. This invalidates some pages
inside the prange. Endding up in mixed pages.

[How]
By classifying each page inside a prange, based on its type. Device or
system memory, during dma mapping call. If page corresponds
to VRAM domain, a flag is set to its dma_addr entry for each GPU.
Then, at the GPU page table mapping. All group of contiguous pages within
the same type are mapped with their proper pte flags.

v2:
Instead of using ttm_res to calculate vram pfns in the svm_range. It is now
done by setting the vram real physical address into drm_addr array.
This makes more flexible VRAM management, plus removes the need to have
a BO reference in the svm_range.

v3:
Remove mapping member from svm_range

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
278a708758 drm/amdkfd: use hmm range fault to get both domain pfns
Now that prange could have mixed domains (VRAM or SYSRAM),
actual_loc nor svm_bo can not be used to check its current
domain and eventually get its pfns to map them in GPU.
Instead, pfns from both domains, are now obtained from
hmm_range_fault through amdgpu_hmm_range_get_pages
call. This is done everytime a GPU map occur.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
1fc160cfe1 drm/amdgpu: get owner ref in validate and map
Get the proper owner reference for amdgpu_hmm_range_get_pages function.
This is useful for partial migrations. To avoid migrating back to
system memory, VRAM pages, that are accessible by all devices in the
same memory domain.
Ex. multiple devices in the same hive.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
a010d98a78 drm/amdkfd: set owner ref to svm range prefault
svm_range_prefault is called right before migrations to VRAM,
to make sure pages are resident in system memory before the migration.
With partial migrations, this reference is used by hmm range get pages
to avoid migrating pages that are already in the same VRAM domain.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
8c21fc49a8 drm/amdkfd: add owner ref param to get hmm pages
The parameter is used in the dev_private_owner to decide if device
pages in the range require to be migrated back to system memory, based
if they are or not in the same memory domain.
In this case, this reference could come from the same memory domain
with devices connected to the same hive.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
3a61dae854 drm/amdkfd: device pgmap owner at the svm migrate init
GPUs in the same XGMI hive have direct access to all
members'VRAM. When mapping memory to a GPU, we don't need
hmm_range_fault to fault device-private pages in the same
hive back to the host. Identifying the page owner as the hive,
rather than the individual GPU, accomplishes this.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:41 -04:00
Alex Sierra
9e4a91cd9e drm/amdkfd: inc counter on child ranges with xnack off
During GPU page table invalidation with xnack off, new ranges
split may occur concurrently in the same prange. Creating a new
child per split. Each child should also increment its
invalid counter, to assure GPU page table updates in these
ranges.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:40 -04:00
Nicholas Kazlauskas
1d40ef902d drm/amd/display: Extend DMUB diagnostic logging to DCN3.1
[Why & How]
Extend existing support for DCN2.1 DMUB diagnostic logging to
DCN3.1 so we can collect useful information if the DMUB hangs.

Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:40 -04:00
Joseph Greathouse
aa61581126 drm/amdgpu: Update NV SIMD-per-CU to 2
Navi series GPUs have 2 SIMDs per CU (and then 2 CUs per WGP).
The NV enum headers incorrectly listed this as 4, which later meant
we were incorrectly reporting the number of SIMDs in the HSA
topology. This could cause problems down the line for user-space
applications that want to launch a fixed amount of work to each
SIMD.

Signed-off-by: Joseph Greathouse <Joseph.Greathouse@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:05:18 -04:00
Alex Deucher
06ac9b6c73 drm/amdgpu: add new dimgrey cavefish DID
Add new PCI device id.

Reviewed-by: Guchun Chen <guchun.chen@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:05:18 -04:00
Shyam Sundar S K
0e2125227e drm/amd/pm: skip PrepareMp1ForUnload message in s0ix
The documentation around PrepareMp1ForUnload message says that
anything sent to SMU after this command would be stalled as the
PMFW would not be in a state to take further job requests.

Technically this is right in case of S3 scenario. But, this might
not be the case during s0ix as the PMC driver would be the last
to send the SMU on the OS_HINT. If SMU gets a PrepareMp1ForUnload
message before the OS_HINT, this would stall the entire S0ix process.

Results show that, this message to SMU is not required during S0ix
and hence skip it.

Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Shyam Sundar S K <Shyam-sundar.S-k@amd.com>
Acked-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:18 -04:00
Huang Rui
9f6a785720 drm/amdgpu: move apu flags initialization to the start of device init
In some asics, we need to adjust the behavior according to the apu flags
at very early stage.

Signed-off-by: Huang Rui <ray.huang@amd.com>
Reviewed-by: Aaron Liu <aaron.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:18 -04:00
Reka Norman
25f178bbd0 drm/amd/display: Respect CONFIG_FRAME_WARN=0 in dml Makefile
Setting CONFIG_FRAME_WARN=0 should disable 'stack frame larger than'
warnings. This is useful for example in KASAN builds. Make the dml
Makefile respect this config.

Fixes the following build warnings with CONFIG_KASAN=y and
CONFIG_FRAME_WARN=0:

drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn30/display_mode_vba_30.c:3642:6:
warning: stack frame size of 2216 bytes in function
'dml30_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=]
drivers/gpu/drm/amd/amdgpu/../display/dc/dml/dcn31/display_mode_vba_31.c:3957:6:
warning: stack frame size of 2568 bytes in function
'dml31_ModeSupportAndSystemConfigurationFull' [-Wframe-larger-than=]

Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Reka Norman <rekanorman@google.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:18 -04:00
Jing Xiangfeng
9ba85914c3 drm/radeon: Add the missed drm_gem_object_put() in radeon_user_framebuffer_create()
radeon_user_framebuffer_create() misses to call drm_gem_object_put() in
an error path. Add the missed function call to fix it.

Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Jing Xiangfeng <jingxiangfeng@huawei.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:05:18 -04:00
Michal Suchanek
c339a80d3a drm/amdgpu/dc: Really fix DCN3.1 Makefile for PPC64
Also copy over the part that makes old gcc handling cross-platform.

Fixes: df7a1658f257 ("drm/amdgpu/dc: fix DCN3.1 Makefile for PPC64")
Fixes: 926d6972efb6 ("drm/amd/display: Add DCN3.1 blocks to the DC Makefile")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:18 -04:00
Tiezhu Yang
c1bfd74bfe drm/radeon: Call radeon_suspend_kms() in radeon_pci_shutdown() for Loongson64
On the Loongson64 platform used with Radeon GPU, shutdown or reboot failed
when console=tty is in the boot cmdline.

radeon_suspend_kms() puts the hw in the suspend state, especially set fb
state as FBINFO_STATE_SUSPENDED:

        if (fbcon) {
                console_lock();
                radeon_fbdev_set_suspend(rdev, 1);
                console_unlock();
        }

Then avoid to do any more fb operations in the related functions:

        if (p->state != FBINFO_STATE_RUNNING)
                return;

So call radeon_suspend_kms() in radeon_pci_shutdown() for Loongson64 to fix
this issue, it looks like some kind of workaround like powerpc.

Co-developed-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Jianmin Lv <lvjianmin@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:05:13 -04:00
Oak Zeng
8dbe43e99f drm/amdgpu: Set ttm caching flags during bo allocation
The ttm caching flags (ttm_cached, ttm_write_combined etc) are
used to determine a buffer object's mapping attributes in both
CPU page table and GPU page table (when that buffer is also
accessed by GPU). Currently the ttm caching flags are set in
function amdgpu_ttm_io_mem_reserve which is called during
DRM_AMDGPU_GEM_MMAP ioctl. This has a problem since the GPU
mapping of the buffer object (ioctl DRM_AMDGPU_GEM_VA) can
happen earlier than the mmap time, thus the GPU page table
update code can't pick up the right ttm caching flags to
decide the right GPU page table attributes.

This patch moves the ttm caching flags setting to function
amdgpu_vram_mgr_new - this function is called during the
first step of a buffer object create (eg, DRM_AMDGPU_GEM_CREATE)
so the later both CPU and GPU mapping function calls will
pick up this flag for CPU/GPU page table set up.

v2: rebase (Alex)

Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Suggested-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Christian Koenig <Christian.Koenig@amd.com>
Reviewed-by: Feifei Xu <Feifei.Xu@amd.com>
Tested-by: Po Huang <Po.Huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:05:12 -04:00
Guchun Chen
b66596f626 drm/amd/display: fix null pointer access in gpu reset
During GPU reset, when receiving a DMCUB OUTBUX0 interrupt,
DAL code will set it to be OUTBOX interrupt and sets hw interrupt.
However, OUTBOX interrupt is not registered yet, so a NULL pointer
access will be executed.

Call Trace:
  dal_irq_service_set+0x30/0x90 [amdgpu]
  dc_interrupt_set+0x24/0x30 [amdgpu]
  amdgpu_dm_set_dmub_outbox_irq_state+0x22/0x30 [amdgpu]
  amdgpu_irq_update+0x77/0xa0 [amdgpu]
  amdgpu_irq_gpu_reset_resume_helper+0x67/0xa0 [amdgpu]
  amdgpu_do_asic_reset+0x219/0x260 [amdgpu]
  amdgpu_device_gpu_recover.cold+0x8c5/0xb64 [amdgpu]
  amdgpu_debugfs_gpu_recover_show+0x2c/0x60 [amdgpu]
  seq_read_iter+0xc2/0x450
  ? do_anonymous_page+0x22c/0x3b0
  seq_read+0xf9/0x140
  full_proxy_read+0x5c/0x90
  vfs_read+0xaa/0x190
  ksys_read+0x67/0xe0
  __x64_sys_read+0x1a/0x20

Fixes: effbf6ca7eafda ("drm/amdgpu/display: remove an old DCN3 guard")

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-07-01 00:03:13 -04:00
Guchun Chen
e38ca7e422 drm/amd/display: fix incorrrect valid irq check
valid DAL irq should be < DAL_IRQ_SOURCES_NUMBER.

Signed-off-by: Guchun Chen <guchun.chen@amd.com>
Reviewed-and-tested-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:02:41 -04:00
Aaron Liu
e2329e74a6 drm/amdgpu: enable sdma0 tmz for Raven/Renoir(V2)
Without driver loaded, SDMA0_UTCL1_PAGE.TMZ_ENABLE is set to 1
by default for all asic. On Raven/Renoir, the sdma goldsetting
changes SDMA0_UTCL1_PAGE.TMZ_ENABLE to 0.
This patch restores SDMA0_UTCL1_PAGE.TMZ_ENABLE to 1.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Acked-by: Luben Tuikov <luben.tuikov@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-07-01 00:02:22 -04:00
Chengming Gui
a2f55040cf drm/amd/amdgpu: enable gpu recovery for beige_goby
Enable gpu recovery for beige_goby.

Signed-off-by: Chengming Gui <Jack.Gui@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:56 -04:00
Darren Powell
91161b06be amdgpu/pm: remove code duplication in show_power_cap calls
v3: updated patch to apply to latest code
 v2: reorder to check pointers before calling pm_runtime_* functions

 created generic function and call with enum from
 * amdgpu_hwmon_show_power_cap_max
 * amdgpu_hwmon_show_power_cap
 * amdgpu_hwmon_show_power_cap_default

=== Test ===
AMDGPU_PCI_ADDR=`lspci -nn | grep "VGA\|Display" | cut -d " " -f 1`
AMDGPU_HWMON=`ls -la /sys/class/hwmon | grep $AMDGPU_PCI_ADDR | cut -d " " -f 10`
HWMON_DIR=/sys/class/hwmon/${AMDGPU_HWMON}

cp pp_show_power_cap.txt{,.old}
lspci -nn | grep "VGA\|Display" > pp_show_power_cap.test.log
FILES="
power1_cap
power1_cap_max
power1_cap_default "

for f in $FILES
do
  echo  $f = `cat $HWMON_DIR/$f` >> pp_show_power_cap.test.log
done

Signed-off-by: Darren Powell <darren.powell@amd.com>
Reviewed-by: Kevin Wang <kevin1.wang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:56 -04:00
Alex Deucher
ed50995514 drm/amdgpu/display: drop unused variable
Remove unused variable.

Fixes: e7d9560aeae514 ("Revert "drm/amd/display: Fix overlay validation by considering cursors"")
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Rodrigo Siqueira
e7d9560aea Revert "drm/amd/display: Fix overlay validation by considering cursors"
This reverts commit 33f409e60eb0c59a4d0d06a62ab4642a988e17f7.

The patch that we are reverting here was originally applied because it
fixes multiple IGT issues and flickering in Android. However, after a
discussion with Sean Paul and Mark, it looks like that this patch might
cause problems on ChromeOS. For this reason, we decided to revert this
patch.

Cc: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Cc: Harry Wentland <Harry.Wentland@amd.com>
Cc: Hersen Wu <hersenxs.wu@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Cc: Mark Yacoub <markyacoub@chromium.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com>
Reviewed-by: Sean Paul <seanpaul@chromium.org>
Reviewed-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-30 00:18:23 -04:00
Veerabadhran Gopalakrishnan
b3a24461f9 amdgpu/nv.c - Added codec query for Beige Goby
Added the Beige Goby capabilities in codec query.

v2: fix build error and indent (James)

Signed-off-by: Veerabadhran Gopalakrishnan <veerabadhran.gopalakrishnan@amd.com>
Reviewed-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Aaron Liu
c8af9390e5 drm/amdgpu: enable tmz on yellow carp
The tmz functions are verified on yellow carp. So enable it by
default.

Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Huang Rui <ray.huang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Philip Yang
d4ebc20070 drm/amdkfd: implement counters for vm fault and migration
Add helper function to get process device data structure from adev to
update counters.

Update vm faults, page_in, page_out counters will no be executed in
parallel, use WRITE_ONCE to avoid any form of compiler optimizations.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Philip Yang
751580b3ff drm/amdkfd: add sysfs counters for vm fault and migration
This is part of SVM profiling API, export sysfs counters for
per-process, per-GPU vm retry fault, pages migrated in and out of GPU vram.

counters will not be updated in parallel in GPU retry fault handler and
migration to vram/ram path, use READ_ONCE to avoid compiler
optimization.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Philip Yang
dcdb4d904b drm/amdkfd: fix sysfs kobj leak
3 cases of kobj leak, which causes memory leak:

kobj_type must have release() method to free memory from release
callback. Don't need NULL default_attrs to init kobj.

sysfs files created under kobj_status should be removed with kobj_status
as parent kobject.

Remove queue sysfs files when releasing queue from process MMU notifier
release callback.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Philip Yang
75ae84c89b drm/amdkfd: add helper function for kfd sysfs create
No functionality change. Modify kfd_sysfs_create_file to use kobject as
parameter, so it becomes common helper function to remove duplicate code
and will simplify new kfd sysfs file create in future.

Move pr_warn to helper function if sysfs file create failed. Set helper
function as void return because caller doesn't use the helper function
return value.

Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Evan Quan
ff4b601a05 drm/amdgpu: update HDP LS settings
Avoid unnecessary register programming on feature disablement.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:23 -04:00
Evan Quan
3e7fbfb40f drm/amdgpu: update GFX MGCG settings
Update GFX MGCG related settings.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:22 -04:00
Evan Quan
754e9883d4 drm/amdgpu: correct clock gating settings on feature unsupported
Clock gating setting is still performed even when the corresponding
CG feature is not supported. And the tricky part is disablement is
actually performed no matter for enablement or disablement request.
That seems not logically right.
Considering HW should already properly take care of the CG state, we
will just skip the corresponding clock gating setting when the feature
is not supported.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:18:22 -04:00
Evan Quan
adcf949e66 drm/amdgpu: fix the hang caused by PCIe link width switch
SMU had set all the necessary fields for a link width switch
but the width switch wasn't occurring because the link was idle
in the L1 state. Setting LC_L1_RECONFIG_EN=0x1 will allow width
switches to also be initiated while in L1 instead of waiting until
the link is back in L0.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-30 00:18:14 -04:00
Evan Quan
5a5da8ae95 drm/amdgpu: fix NAK-G generation during PCI-e link width switch
A lot of NAK-G being generated when link widht switching is happening.
WA for this issue is to program the SPC to 4 symbols per clock during
bootup when the native PCIE width is x4.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
2021-06-30 00:17:56 -04:00
Evan Quan
9c26ddb1c5 drm/amdgpu: fix Navi1x tcp power gating hang when issuing lightweight invalidaiton
Fix TCP hang when a lightweight invalidation happens on Navi1x.

Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2021-06-30 00:17:47 -04:00