59 Commits

Author SHA1 Message Date
Matthew Auld
503725c2d9 drm/i915/ttm: mappable migration on fault
The end goal is to have userspace tell the kernel what buffers will
require CPU access, however if we ever reach the CPU fault handler, and
the current resource is not mappable, then we should attempt to migrate
the buffer to the mappable portion of LMEM, or even system memory, if the
allowable placements permit it.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-2-matthew.auld@intel.com
2022-03-01 08:50:46 +00:00
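The fault-time decision described in 503725c2d9 can be sketched roughly as below. This is only an illustration under stated assumptions: the placement bits, the `fault_action` enum and `pick_fault_action()` are hypothetical names, not the driver's code, and "mappable LMEM is available" is reduced to a boolean.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical placement bits an object may be allowed to use. */
#define SKETCH_PL_SYSTEM (1u << 0)
#define SKETCH_PL_LMEM   (1u << 1)

/* Hypothetical outcomes of a CPU fault on the object. */
enum fault_action {
	FAULT_MAP_IN_PLACE,
	FAULT_MIGRATE_TO_MAPPABLE_LMEM,
	FAULT_MIGRATE_TO_SYSTEM,
	FAULT_FAIL,
};

/*
 * If the current resource is already mappable, nothing moves. Otherwise
 * prefer the mappable portion of LMEM, then fall back to system memory,
 * but only if the allowable placements permit it.
 */
static enum fault_action pick_fault_action(bool currently_mappable,
					   unsigned int allowed,
					   bool mappable_lmem_available)
{
	assert(allowed != 0);

	if (currently_mappable)
		return FAULT_MAP_IN_PLACE;
	if ((allowed & SKETCH_PL_LMEM) && mappable_lmem_available)
		return FAULT_MIGRATE_TO_MAPPABLE_LMEM;
	if (allowed & SKETCH_PL_SYSTEM)
		return FAULT_MIGRATE_TO_SYSTEM;
	return FAULT_FAIL;
}
```

An LMEM-only object with no room left in the mappable portion ends up at FAULT_FAIL in this sketch; how the real fault handler reports that case is not stated in the commit message.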
Matthew Auld
9373505967 drm/i915/ttm: make eviction mappable aware
If we need to make room for some mappable object, then we should
only victimize objects that have one or more pages that occupy the visible
portion of LMEM. Let's also create a new priority hint for objects that
are placed in mappable memory, where we know that CPU access was
requested, that way we hopefully victimize these last.

v2(Thomas): s/TTM_PL_PRIV/I915_PL_LMEM0/

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-1-matthew.auld@intel.com
2022-03-01 08:50:45 +00:00
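A minimal sketch of the victim test described in 9373505967, assuming the visible (CPU-mappable) portion of LMEM is the prefix [0, io_size) and that an object's backing store is a list of extents. `struct extent_sketch` and both helpers are hypothetical names, not the TTM or i915 API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one contiguous chunk of an object in LMEM. */
struct extent_sketch {
	uint64_t start; /* byte offset into LMEM */
	uint64_t size;  /* length in bytes */
};

/* True if at least one non-empty extent begins inside [0, io_size). */
static bool touches_visible(const struct extent_sketch *ext, size_t count,
			    uint64_t io_size)
{
	for (size_t i = 0; i < count; i++) {
		if (ext[i].size != 0 && ext[i].start < io_size)
			return true;
	}
	return false;
}

/*
 * When making room for a mappable object, evicting something that lives
 * entirely in the non-visible portion frees nothing useful, so skip it.
 */
static bool eviction_valuable_sketch(bool need_mappable,
				     const struct extent_sketch *ext,
				     size_t count, uint64_t io_size)
{
	assert(ext != NULL || count == 0);

	if (!need_mappable)
		return true;
	return touches_visible(ext, count, io_size);
}
```

The priority hint mentioned in the same commit is a separate mechanism and is not modelled here.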
Matthew Auld
30b9d1b3ef drm/i915: add I915_BO_ALLOC_GPU_ONLY
If the user doesn't require CPU access for the buffer, then
ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the
non-mappable portion of LMEM, on devices with small BAR.

v2(Thomas):
  - The BO_ALLOC_TOPDOWN naming here is poor, since this is pure lies on
    systems that don't even have small BAR. A better name is GPU_ONLY,
    which is accurate regardless of the configuration.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-3-matthew.auld@intel.com
2022-02-28 08:47:34 +00:00
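The interplay between the mappable-by-default behaviour (3312a4ac8a) and the GPU_ONLY flag (30b9d1b3ef) might look like the sketch below. Everything here is an assumption for illustration: `SKETCH_ALLOC_GPU_ONLY`, `struct alloc_hint` and `pick_alloc_hint()` are hypothetical, and the "prefer the top of the range" policy is inferred from the older TOPDOWN name mentioned in the v2 note, not from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; stands in for the GPU_ONLY idea, not the real define. */
#define SKETCH_ALLOC_GPU_ONLY (1u << 0)

struct alloc_hint {
	uint64_t start;  /* first byte the allocator may use */
	uint64_t end;    /* one past the last byte it may use */
	bool prefer_top; /* try the high, non-mappable end first */
};

/*
 * Default: confine the allocation to the CPU-mappable prefix.
 * GPU_ONLY: allow the whole region and, on small-BAR devices only,
 * steer the allocation away from the scarce mappable portion.
 */
static struct alloc_hint pick_alloc_hint(uint64_t total_size, uint64_t io_size,
					 unsigned int flags)
{
	struct alloc_hint hint = { .start = 0, .end = io_size, .prefer_top = false };

	assert(io_size <= total_size);

	if (flags & SKETCH_ALLOC_GPU_ONLY) {
		hint.end = total_size;
		hint.prefer_top = io_size < total_size;
	}
	return hint;
}
```

On a device where the whole of LMEM is mappable (io_size equal to total_size), the flag changes nothing in this sketch, which matches the point that GPU_ONLY stays an accurate name regardless of the configuration.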
Matthew Auld
3312a4ac8a drm/i915/ttm: require mappable by default
On devices with non-mappable LMEM ensure we always allocate the pages
within the mappable portion. For now we assume that all LMEM buffers
will require CPU access, which is also in line with pretty much all
current kernel internal users. In the next patch we will introduce a new
flag to override this behaviour.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-2-matthew.auld@intel.com
2022-02-28 08:47:34 +00:00
Matthew Auld
235582ca96 drm/i915: add io_size plumbing
With small LMEM-BAR we need to be able to differentiate between the
total size of LMEM, and how much of it is CPU mappable. The end goal is
to be able to utilize the entire range, even if part of it is not CPU
accessible.

v2: also update intelfb_create

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-1-matthew.auld@intel.com
2022-02-28 08:47:27 +00:00
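To illustrate the distinction 235582ca96 introduces, here is a minimal sketch that keeps the total LMEM size and the CPU-mappable size as separate fields. `struct region_sketch` and `range_is_mappable()` are hypothetical names, and the sketch assumes the mappable portion is the prefix starting at offset 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical region description: not the i915 structure. */
struct region_sketch {
	uint64_t total_size; /* all of LMEM, in bytes */
	uint64_t io_size;    /* CPU-mappable prefix, in bytes */
};

/* True if [offset, offset + size) lies entirely in the mappable prefix. */
static bool range_is_mappable(const struct region_sketch *r,
			      uint64_t offset, uint64_t size)
{
	assert(r->io_size <= r->total_size);

	if (size > r->io_size)
		return false;
	/* Written this way so offset + size cannot overflow. */
	return offset <= r->io_size - size;
}
```

With a 256-byte mappable prefix in a 1024-byte region, a 64-byte range at offset 192 is mappable while the same range at offset 512 is usable by the GPU only.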
Rodrigo Vivi
30424ebae8 Merge tag 'drm-intel-gt-next-2022-02-17' of git://anongit.freedesktop.org/drm/drm-intel into drm-intel-next
UAPI Changes:

- Weak parallel submission support for execlists

  Minimal implementation of the parallel submission support for
  execlists backend that was previously only implemented for GuC.
  Support one sibling non-virtual engine.

Core Changes:

- Two backmerges of drm/drm-next for header file renames/changes and
  i915_regs reorganization

Driver Changes:

- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)

- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)

- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)

- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
2022-02-23 15:03:51 -05:00
Jani Nikula
82508de228 drm/i915: include shmem_fs.h only where needed
Don't include shmem_fs.h in i915_drv.h, reducing the build dependencies.

Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/44eade17f7ba1480d67c584466eeea3553f31506.1644507885.git.jani.nikula@intel.com
2022-02-14 13:39:49 +02:00
Matthew Auld
ba2c5d1502 drm/i915/ttm: tweak priority hint selection
For some reason we are selecting PRIO_HAS_PAGES when we don't have
mm.pages, and vice versa.

v2(Thomas):
  - Add missing fixes tag

Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209111652.468762-1-matthew.auld@intel.com
2022-02-10 11:54:50 +00:00
Matthew Auld
6ef295e342 drm/i915/ttm: ensure we unmap when purging
Purging can happen during swapping out, or directly invoked with the
madvise ioctl. In such cases this doesn't involve a ttm move, which
skips unmapping the object.

v2(Thomas):
- add ttm_truncate helper, and just call into i915_ttm_move_notify() to
  handle the unmapping step

Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-4-matthew.auld@intel.com
(cherry picked from commit ab4911b7d411ab2ef3b38322178b9138e156c393)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 14:00:47 +00:00
Matthew Auld
8ee262ba79 drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages.
Importantly this should now handle the ttm swapping case, and all other
places that already call into i915_ttm_move_notify().

v2: fix up the selftest

Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-3-matthew.auld@intel.com
(cherry picked from commit 903e0387270eef14a711c0feb23b7bf62d2480df)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 13:55:07 +00:00
Matthew Auld
03ee595678 drm/i915/ttm: only fault WILLNEED objects
Don't attempt to fault and re-populate purged objects. By some fluke
this passes the dontneed-after-mmap IGT, but for the wrong reasons.

Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-2-matthew.auld@intel.com
(cherry picked from commit f3cb4a2de5410147b53e53416a3af0ffe26b5f4e)
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
2022-01-10 13:55:02 +00:00
Matthew Auld
ab4911b7d4 drm/i915/ttm: ensure we unmap when purging
Purging can happen during swapping out, or directly invoked with the
madvise ioctl. In such cases this doesn't involve a ttm move, which
skips unmapping the object.

v2(Thomas):
- add ttm_truncate helper, and just call into i915_ttm_move_notify() to
  handle the unmapping step

Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-4-matthew.auld@intel.com
2022-01-10 11:11:32 +00:00
Matthew Auld
903e038727 drm/i915/ttm: add unmap_virtual callback
Ensure we call ttm_bo_unmap_virtual when releasing the pages.
Importantly this should now handle the ttm swapping case, and all other
places that already call into i915_ttm_move_notify().

v2: fix up the selftest

Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-3-matthew.auld@intel.com
2022-01-10 11:01:44 +00:00
Matthew Auld
f3cb4a2de5 drm/i915/ttm: only fault WILLNEED objects
Don't attempt to fault and re-populate purged objects. By some fluke
this passes the dontneed-after-mmap IGT, but for the wrong reasons.

Fixes: cf3e3e86d779 ("drm/i915: Use ttm mmap handling for ttm bo's.")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220106174910.280616-2-matthew.auld@intel.com
2022-01-10 11:01:42 +00:00
Matthew Auld
ffa3fe080c drm/i915: clean up shrinker_release_pages
Add some proper flags for the different modes, and shorten the name to
something more snappy.

Suggested-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211215110746.865-2-matthew.auld@intel.com
2022-01-10 10:49:50 +00:00
Robert Beckett
5719d4fee1 drm/i915/ttm: fix large buffer population trucation
ttm->num_pages is uint32_t which was causing very large buffers to
only populate a truncated size.

This fixes gem_create@create-clear igt test on large memory systems.

Fixes: 7ae034590cea ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211210195005.2582884-1-bob.beckett@collabora.com
2021-12-14 14:40:10 +00:00
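The kind of truncation 5719d4fee1 describes can be reproduced in isolation. The sketch below assumes the byte size was derived by shifting a 32-bit page count by a 4 KiB page shift; the helper names and `SKETCH_PAGE_SHIFT` are hypothetical, and the actual fix in the driver may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Shift performed in 32 bits: the high bits are lost before widening. */
static uint64_t populate_bytes_truncated(uint32_t num_pages)
{
	return (uint32_t)(num_pages << SKETCH_PAGE_SHIFT);
}

/* Widen to 64 bits first, then shift: the full size is preserved. */
static uint64_t populate_bytes_widened(uint32_t num_pages)
{
	uint64_t bytes = (uint64_t)num_pages << SKETCH_PAGE_SHIFT;

	assert((bytes >> SKETCH_PAGE_SHIFT) == num_pages);
	return bytes;
}
```

For a buffer of 2^20 pages (4 GiB) the truncated variant yields 0 bytes, while the widened variant yields 2^32 bytes; for small buffers the two agree, which is why the problem only shows up on large memory systems.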
Thomas Hellström
6385eb7ad8 drm/i915/ttm: Implement asynchronous TTM moves
Don't wait sync while migrating, but rather make the GPU blit await the
dependencies and add a moving fence to the object.

This also enables asynchronous VRAM management in that on eviction,
rather than waiting for the moving fence to expire before freeing VRAM,
it is freed immediately and the fence is stored with the VRAM manager and
handed out to newly allocated objects to await before clears and swapins,
or for kernel objects before setting up gpu vmas or mapping.

To collect dependencies before migrating, add a set of utilities that
coalesce these to a single dma_fence.

What is still missing for fully asynchronous operation is asynchronous vma
unbinding, which is still to be implemented.

This commit substantially reduces execution time in the gem_lmem_swapping
test.

v2:
- Make a couple of functions static.
v4:
- Fix some style issues (Matthew Auld)
- Audit and add more checks for ghost objects (Matthew Auld)
- Add more documentation for the i915_deps utility (Matthew Auld)
- Simplify the i915_deps_sync() function
v6:
- Re-check for fence signaled before returning -EBUSY (Matthew Auld)
- Use dma_resv_iter_is_exclusive() (Matthew Auld)
- Await all dma-resv fences before a migration blit (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-6-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:19 +01:00
Thomas Hellström
004746e4b1 drm/i915/ttm: Correctly handle waiting for gpu when shrinking
With async migration, the shrinker may end up wanting to release the
pages of an object while the migration blit is still running, since
the GT migration code doesn't set up VMAs and the shrinker is thus
oblivious to the fact that the GPU is still using the pages.

Add waiting for gpu in the shrinker_release_pages() op and an
argument to that function indicating whether the shrinker expects it
to not wait for gpu. In the latter case the shrinker_release_pages()
op will return -EBUSY if the object is not idle.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-5-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:18 +01:00
Thomas Hellström
8b1f7f92e5 drm/i915/ttm: Drop region reference counting
There is an interesting refcounting loop:
struct intel_memory_region has a struct ttm_resource_manager,
ttm_resource_manager->move may hold a reference to i915_request,
i915_request may hold a reference to intel_context,
intel_context may hold a reference to drm_i915_gem_object,
drm_i915_gem_object may hold a reference to intel_memory_region.

Break this loop by dropping region reference counting.

In addition, have regions with a manager moving fence make sure
that all region objects are released before freeing the region.

v6:
- Fix a code comment.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-4-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:16 +01:00
Thomas Hellström
05d1c76107 drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
Move the i915_gem_obj_copy_ttm() function to i915_gem_ttm_move.h.
This will help keep a number of functions static when introducing
async moves.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-3-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:15 +01:00
Matthew Auld
be373fad54 drm/i915/ttm: fixup build failure
drm-intel-gt-next fails to build with:

drivers/gpu/drm/i915/gem/i915_gem_ttm.c: In function ‘vm_fault_ttm’:
drivers/gpu/drm/i915/gem/i915_gem_ttm.c:862:23: error: too many arguments to function ‘ttm_bo_vm_fault_reserved’
  862 |                 ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
      |                       ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211123125814.1703220-1-matthew.auld@intel.com
2021-11-23 16:36:22 +00:00
Dan Carpenter
6164807dd2 drm/i915/ttm: Fix error code in i915_ttm_eviction_valuable()
This function returns a bool type so returning -EBUSY is equivalent to
returning true.  It should return false instead.

Fixes: 7ae034590cea ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122061438.GA2492@kili
2021-11-22 14:20:44 +00:00
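The pitfall fixed by 6164807dd2 is easy to reproduce outside the kernel: in C, any non-zero value converted to bool becomes true, so a negative error code returned from a bool function reads as "yes". The two functions below are hypothetical reductions of the pattern, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Buggy pattern: -EBUSY is non-zero, so the caller sees true. */
static bool valuable_buggy(bool busy)
{
	if (busy)
		return -EBUSY;
	return true;
}

/* Fixed pattern: a bool function answers "no" with false. */
static bool valuable_fixed(bool busy)
{
	if (busy)
		return false;
	return true;
}
```

Calling valuable_buggy(true) yields true, the opposite of what the error code was meant to convey, while valuable_fixed(true) yields false.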
Thomas Hellström
d3cb30f8dc drm/i915/ttm: Fix illegal addition to shrinker list
There's a small window of opportunity during which the adjust_lru()
function can be called with a GEM refcount of zero from the TTM
eviction code. This results in a kernel BUG().

Ensure that we don't attempt to modify the GEM shrinker lists unless
we have a GEM refcount.

Fixes: ebd4a8ec7799 ("drm/i915/ttm: move shrinker management into adjust_lru")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110085527.1033475-1-thomas.hellstrom@linux.intel.com
2021-11-11 09:00:00 +01:00
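The guard described in d3cb30f8dc follows the familiar "take a reference unless the count is already zero" pattern. The sketch below is a simplified, non-atomic illustration with hypothetical names (`struct lru_obj`, `get_unless_zero_sketch()`, `adjust_lru_sketch()`); the kernel uses atomic refcount helpers for this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical object with a plain (non-atomic) reference count. */
struct lru_obj {
	unsigned int refcount;
	bool on_shrinker_list;
};

/* Take a reference only if the object is still alive. */
static bool get_unless_zero_sketch(struct lru_obj *obj)
{
	if (obj->refcount == 0)
		return false;
	obj->refcount++;
	return true;
}

static void put_sketch(struct lru_obj *obj)
{
	assert(obj->refcount > 0);
	obj->refcount--;
}

/*
 * Only touch the shrinker list while holding a reference; an object
 * whose refcount already dropped to zero is left untouched.
 */
static bool adjust_lru_sketch(struct lru_obj *obj, bool shrinkable)
{
	if (!get_unless_zero_sketch(obj))
		return false;
	obj->on_shrinker_list = shrinkable;
	put_sketch(obj);
	return true;
}
```

With a refcount of zero the function returns false and the list membership is not modified, which is the window the commit closes.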
Thomas Hellström
3589fdbd3b drm/i915/ttm: Reorganize the ttm move code
We are about to introduce failsafe- and asynchronous migration and
ttm moves.
This will add complexity and code to the TTM move code so it makes sense
to split it out to a separate file to make the i915 TTM code easier to
digest.
Split the i915 TTM move code out and since we will have to change the name
of the gpu_binds_iomem() and cpu_maps_iomem() functions anyway,
we alter the name of gpu_binds_iomem() to i915_ttm_gtt_binds_lmem() which
better reflects what it is used for.
With this we also add some more documentation. Otherwise there should be
no functional change.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-2-thomas.hellstrom@linux.intel.com
2021-11-05 09:05:30 +01:00
Thomas Hellström
cad7109a2b drm/i915: Introduce refcounted sg-tables
As we start to introduce asynchronous failsafe object migration,
where we update the object state and then submit asynchronous
commands we need to record what memory resources are actually used
by various parts of the command stream. Initially for three purposes:

1) Error capture.
2) Asynchronous migration error recovery.
3) Asynchronous vma bind.

At the time these happen, the object state may have been updated
to be several migrations ahead and object sg-tables discarded.

In order to make it possible to keep sg-tables with memory resource
information for these operations, introduce refcounted sg-tables that
aren't freed until the last user is done with them.

The alternative would be to reference information sitting on the
corresponding ttm_resources which typically have the same lifetime as
these refcounted sg-tables, but that leads to other awkward constructs:
Due to the design direction chosen for ttm resource managers that would
lead to diamond-style inheritance, the LMEM resources may sometimes be
prematurely freed, and finally the subclassed struct ttm_resource would
have to bleed into the asynchronous vma bind code.

v3:
- Address a number of style issues (Matthew Auld)
v4:
- Don't check for st->sgl being NULL in i915_ttm_tt__shmem_unpopulate(),
  that should never happen. (Matthew Auld)
v5:
- Fix a potential double-free (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211101122444.114607-1-thomas.hellstrom@linux.intel.com
2021-11-01 18:10:49 +01:00
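The core idea of cad7109a2b, a table that is not freed until its last user is done with it, can be sketched in plain C. This is a simplified, non-atomic illustration: `struct rsgt_sketch` and its helpers are hypothetical, and an int array stands in for the scatter-gather entries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical refcounted table; entries stand in for sg entries. */
struct rsgt_sketch {
	unsigned int refs;
	int *entries;
};

/* Allocate a table holding one reference for the caller. */
static struct rsgt_sketch *rsgt_alloc(size_t count)
{
	struct rsgt_sketch *t = malloc(sizeof(*t));

	if (!t)
		return NULL;
	t->entries = calloc(count ? count : 1, sizeof(*t->entries));
	if (!t->entries) {
		free(t);
		return NULL;
	}
	t->refs = 1;
	return t;
}

/* Each additional user (error capture, async bind, ...) takes a reference. */
static struct rsgt_sketch *rsgt_get(struct rsgt_sketch *t)
{
	assert(t->refs > 0);
	t->refs++;
	return t;
}

/* Drop a reference; returns true if this call freed the table. */
static bool rsgt_put(struct rsgt_sketch *t)
{
	assert(t->refs > 0);
	if (--t->refs)
		return false;
	free(t->entries);
	free(t);
	return true;
}
```

The object can drop its own reference after a migration while an in-flight operation still holds one, so the table outlives the object state that created it.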
Matthew Auld
5d12ffe6be drm/i915/ttm: enable shmem tt backend
Turn on the shmem tt backend, and enable shrinking.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-8-matthew.auld@intel.com
2021-10-22 13:19:30 +01:00
Matthew Auld
2eda4fc6d0 drm/i915/ttm: use cached system pages when evicting lmem
This should let us do an accelerated copy directly to the shmem pages
when temporarily moving lmem-only objects, where the i915-gem shrinker
can later kick in to swap out the pages, if needed.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-7-matthew.auld@intel.com
2021-10-22 13:19:29 +01:00
Matthew Auld
ebd4a8ec77 drm/i915/ttm: move shrinker management into adjust_lru
We currently just evict lmem objects to system memory when under memory
pressure. For this case we might lack the usual object mm.pages, which
effectively hides the pages from the i915-gem shrinker, until we
actually "attach" the TT to the object, or in the case of lmem-only
objects it just gets migrated back to lmem when touched again.

For all cases we can just adjust the i915 shrinker LRU each time we also
adjust the TTM LRU. The two cases we care about are:

  1) When something is moved by TTM, including when initially populating
     an object. Importantly this covers the case where TTM moves something from
     lmem <-> smem, outside of the normal get_pages() interface, which
     should still ensure the shmem pages underneath are reclaimable.

  2) When calling into i915_gem_object_unlock(). The unlock should
     ensure the object is removed from the shrinker LRU, if it was indeed
     swapped out, or just purged, when the shrinker drops the object lock.

v2(Thomas):
  - Handle managing the shrinker LRU in adjust_lru, where it is always
    safe to touch the object.
v3(Thomas):
  - Pretty much a re-write. This time piggy back off the shrink_pin
    stuff, which actually seems to fit quite well for what we want here.
v4(Thomas):
  - Just use a simple boolean for tracking ttm_shrinkable.
v5:
  - Ensure we call adjust_lru when faulting the object, to ensure the
    pages are visible to the shrinker, if needed.
  - Add back the adjust_lru when in i915_ttm_move (Thomas)
v6(Reported-by: kernel test robot <lkp@intel.com>):
  - Remove unused i915_tt

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
2021-10-22 13:19:26 +01:00
Matthew Auld
7ae034590c drm/i915/ttm: add tt shmem backend
For cached objects we can allocate our pages directly in shmem. This
should make it possible (in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could for
    example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need to
    handle objects which don't even have mm.pages, so bundling this into
    put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from the
    shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
    these just get skipped anyway. We can try to come up with something
    better later.
v4(Thomas):
  - s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which
    apparently doesn't do anything with streaming mappings.
  - Just pass along the error for ->truncate, and assume nothing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Oak Zeng <oak.zeng@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-2-matthew.auld@intel.com
2021-10-22 13:19:20 +01:00
Dave Airlie
1176d15f0f Merge tag 'drm-intel-gt-next-2021-10-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Add uAPI for using PXP protected objects

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064

- Add PCI IDs and LMEM discovery/placement uAPI for DG1

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584

- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S

Cross-subsystem Changes:

- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"

Core Changes:

- Update ttm_move_memcpy for async use (Thomas)

Driver Changes:

- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
  Sean, Anshuman)
  See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)

- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)

- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
  PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-11 18:09:39 +10:00
Thomas Hellström
068396bb21 drm/i915/ttm: Rework object initialization slightly
We may end up in i915_ttm_bo_destroy() in an error path before the
object is fully initialized. In that case it's not correct to call
__i915_gem_free_object(), because that function
a) Assumes the gem object refcount is 0, which it isn't.
b) frees the placements which are owned by the caller until the
init_object() region ops returns successfully. Fix this by providing
a lightweight cleanup function __i915_gem_object_fini() which is also
called by __i915_gem_free_object().

While doing this, also make sure we call dma_resv_fini() as part of
ordinary object destruction and not from the RCU callback that frees
the object. This will help track down bugs where the object is incorrectly
locked from an RCU lookup.

Finally, make sure the object isn't put on the region list until it's
either locked or fully initialized in order to block list processing of
partially initialized objects.

v2:
- The TTM object backend memory was freed before the gem pages were
  put. Separate this functionality into __i915_gem_object_pages_fini()
  and call it from the TTM delete_mem_notify() callback.
v3:
- Include i915_gem_object_free_mmaps() in __i915_gem_object_pages_fini()
  to make sure we don't inadvertently introduce a race.

Fixes: 48b096126954 ("drm/i915: Move __i915_gem_free_object to ttm_bo_destroy")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20210930113236.583531-1-thomas.hellstrom@linux.intel.com
2021-10-01 13:11:58 +02:00
Matthew Auld
43d46f0b78 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 16:17:56 +02:00
Thomas Hellström
c56ce95653 drm/i915: Implement LMEM backup and restore for suspend / resume
Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.

Backup is performed in three steps:
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.

Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.
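
The three backup steps can be sketched as a toy model. This is Python rather than kernel C. The Obj type and its busy field are assumptions used only to show why the first pass is opportunistic; none of these names are i915 APIs.

```python
from dataclasses import dataclass


@dataclass
class Obj:
    """Toy LMEM object (hypothetical fields)."""
    name: str
    pinned: bool = False
    busy: bool = False        # assumption: busy objects are skipped in step 1
    placement: str = "lmem"
    has_backup: bool = False


def suspend_backup(objects):
    """Return a list of (step, object name, copy method) tuples."""
    steps = []

    # Step 1: opportunistically evict what can be evicted right now.
    for o in objects:
        if o.placement == "lmem" and not o.pinned and not o.busy:
            o.placement = "system"
            steps.append((1, o.name, "blit"))

    # The GT goes idle here, so nothing is busy any more.
    for o in objects:
        o.busy = False

    # Step 2: evict the remaining evictable objects with the blitter.
    for o in objects:
        if o.placement == "lmem" and not o.pinned:
            o.placement = "system"
            steps.append((2, o.name, "blit"))

    # Step 3: pinned objects stay in place and get a memcpy backup.
    for o in objects:
        if o.placement == "lmem" and o.pinned:
            o.has_backup = True
            steps.append((3, o.name, "memcpy"))

    return steps
```

The sketch does not model the follow-up change mentioned in step 2, where pinned objects not used by the blitter are also backed up with the blitter.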

v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
  suspend / resume works with a slightly modified GuC submission enabling
  patch series.

v3:
- Fix a potential use-after-free (Matthew Auld)
- Use i915_gem_object_create_shmem() instead of
  i915_gem_object_create_region (Matthew Auld)
- Minor simplifications (Matthew Auld)
- Fix up kerneldoc for i915_ttm_restore_region().
- Final lmem_suspend() call moved to i915_gem_backup_suspend from
  i915_gem_suspend_late, since the latter gets called at driver unload
  and we don't unnecessarily want to run it at that time.

v4:
- Interface change of ttm- & lmem suspend / resume functions to use
  flags rather than bools. (Matthew Auld)
- Completely drop the i915_gem_backup_suspend change (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-5-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:11 +02:00
Thomas Hellström
0d9388635a drm/i915/ttm: Implement a function to copy the contents of two TTM-based objects
When backing up or restoring contents of pinned objects at suspend /
resume time we need to allocate a new object as the backup. Add a function
to facilitate copies between the two. Some data needs to be copied before
the migration context is ready for operation, so make sure we can
disable accelerated copies.

v2:
- Fix a missing return value check (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-2-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:09 +02:00
Maarten Lankhorst
48b0961269 drm/i915: Move __i915_gem_free_object to ttm_bo_destroy
When we implement delayed destroy, we may have a second
call to the delete_mem_notify() handler, while free_object()
should only be called once.

Move it to bo->destroy(), to ensure it's only called once.
This fixes some weird memory corruption issues with delayed
destroy when async eviction is used.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210830121006.2978297-2-maarten.lankhorst@linux.intel.com
Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-09-16 15:30:27 +02:00
Joonas Lahtinen
d5dd580deb Merge drm/drm-next into drm-intel-gt-next
Close the divergence which has caused patches not to apply and
have a solid baseline for the PXP patches that Rodrigo will send
a topic branch PR for.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-09-15 13:23:27 +03:00
Maxime Ripard
2f76520561 Merge drm/drm-next into drm-misc-next
Kickstart new drm-misc-next cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-14 09:25:30 +02:00
Linus Torvalds
23852bec53 RDMA v5.15 merge window Pull Request

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is quite a small cycle, no major series stands out. The HNS and
  rxe drivers saw the most activity this cycle, with rxe being broken
  for a good chunk of time. The significant deleted line count is due to
  a SPDX cleanup series.

  Summary:

   - Various cleanup and small features for rtrs

   - kmap_local_page() conversions

   - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns

   - Cache the IB subnet prefix

   - Rework how CRC is calculated in rxe

   - Clean reference counting in iwpm's netlink

   - Pull object allocation and lifecycle for user QPs to the uverbs
     core code

   - Several small hns features and continued general code cleanups

   - Fix the scatterlist confusion of orig_nents/nents introduced in an
     earlier patch creating the append operation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
  RDMA/mlx5: Relax DCS QP creation checks
  RDMA/hns: Delete unnecessary blank lines.
  RDMA/hns: Encapsulate the qp db as a function
  RDMA/hns: Adjust the order in which irq are requested and enabled
  RDMA/hns: Remove RST2RST error prints for hw v1
  RDMA/hns: Remove dqpn filling when modify qp from Init to Init
  RDMA/hns: Fix QP's resp incomplete assignment
  RDMA/hns: Fix query destination qpn
  RDMA/hfi1: Convert to SPDX identifier
  IB/rdmavt: Convert to SPDX identifier
  RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
  RDMA/hns: Bugfix for the missing assignment for dip_idx
  RDMA/hns: Bugfix for data type of dip_idx
  RDMA/hns: Fix incorrect lsn field
  RDMA/irdma: Remove the repeated declaration
  RDMA/core/sa_query: Retry SA queries
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
  RDMA/hns: Delete unused hns bitmap interface
  ...
2021-09-02 14:47:21 -07:00
Thomas Hellström
669076334b drm/ttm, drm/i915: Update ttm_move_memcpy for async use
The buffer object argument to ttm_move_memcpy was only used to
determine whether the destination memory should be cleared only
or whether we should copy data. Replace it with a "clear" bool, and
update the callers.

The intention here is to be able to use ttm_move_memcpy() async under
a dma-fence as a fallback if an accelerated blit fails in a security-
critical path where data might leak if the blit is not properly
performed. For that purpose the bo is an unsuitable argument since
its relevant members might already have changed at call time.
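
The interface change can be sketched as a toy model. This is Python rather than kernel C; move_memcpy and its parameters are hypothetical names for illustration, and byte buffers stand in for the source and destination mappings.

```python
def move_memcpy(clear, num_pages, dst, src=None, page_size=4):
    """Clear or copy num_pages pages into dst.

    The caller decides up front whether to clear or copy and passes that
    decision in as a bool, so no buffer object is needed at call time.
    """
    for i in range(num_pages):
        start, end = i * page_size, (i + 1) * page_size
        if clear:
            dst[start:end] = bytes(page_size)  # zero-fill this page
        else:
            dst[start:end] = src[start:end]    # copy this page
```

Because the function only sees the flag and the two buffers, it can run later (for example after a failed accelerated blit) without consulting object state that may have changed in the meantime.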

Finally, update the ttm_move_memcpy kerneldoc that seems to have
ended up with a stale version.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210813144331.372957-3-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210813144331.372957-3-thomas.hellstrom@linux.intel.com
2021-08-25 16:05:47 +02:00
Thomas Hellström
d8ac30fd47 drm/i915/ttm: Reorganize the ttm move code somewhat
In order to make the code a bit more readable and to facilitate
async memcpy moves, reorganize the move code a little. Determine
at an early stage whether to copy or to clear.

v2:
- Don't set up the memcpy iterators unless we are actually going to memcpy.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20210813144331.372957-2-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210813144331.372957-2-thomas.hellstrom@linux.intel.com
2021-08-25 16:05:47 +02:00
Christian König
d5f45d1e2f drm/ttm: remove ttm_tt_destroy_common v2
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead.

We don't need this any more since we removed the unbind from the destroy
code paths in the drivers.

Also add a warning to ttm_tt_fini() if we try to fini a still populated TT
object.

v2: instead of reverting the patch move the functionality to different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com
2021-08-23 13:54:55 +02:00
Jason Ekstrand
75e382850b drm/i915/gem/ttm: Only call __i915_gem_object_set_pages if needed
__i915_ttm_get_pages does two things.  First, it calls ttm_bo_validate()
to check the given placement and migrate the BO if needed.  Then, it
updates the GEM object to match, in case the object was migrated.  If
no migration occurred, however, we might still have pages on the GEM
object in which case we don't need to fetch them from TTM and call
__i915_gem_object_set_pages.  This hasn't been a problem before because
the primary user of __i915_ttm_get_pages is __i915_gem_object_get_pages
which only calls it if the GEM object doesn't have pages.

However, i915_ttm_migrate also uses __i915_ttm_get_pages to do the
migration so this meant it was unsafe to call on an already populated
object.  This patch checks i915_gem_object_has_pages() before trying to
__i915_gem_object_set_pages so i915_ttm_migrate is safe to call, even on
populated objects.
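
The guard described above can be sketched as a toy model. This is Python rather than kernel C. GemObject and ttm_get_pages are hypothetical stand-ins, and the rule that pages may not be set twice is an assumption of the sketch.

```python
class GemObject:
    """Toy stand-in for a GEM object (hypothetical)."""

    def __init__(self, region):
        self.region = region
        self.pages = None
        self.set_pages_calls = 0

    def has_pages(self):
        return self.pages is not None

    def set_pages(self, pages):
        # Assumption of this toy model: setting pages on an object that
        # already has them is a bug.
        assert not self.has_pages(), "pages already set"
        self.pages = pages
        self.set_pages_calls += 1


def ttm_get_pages(obj, target_region):
    """Validate the placement, then sync the GEM pages only when needed."""
    if obj.region != target_region:
        # A move drops the old pages before the region changes.
        obj.pages = None
        obj.region = target_region
    if not obj.has_pages():
        obj.set_pages(f"pages@{obj.region}")
```

Calling ttm_get_pages() twice with the same region leaves the existing pages alone, which is what makes the migrate path safe on a populated object in this sketch.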

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-6-jason@jlekstrand.net
2021-07-26 16:37:34 +01:00
Jason Ekstrand
7d6a276e2f drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
This reverts the rest of 0edbb9ba1bfe ("drm/i915: Move cmd parser
pinning to execbuffer").  Now that the only user of i915_gem_object_get_sg
without allow_alloc has been removed, we can drop the parameter.  This
portion of the revert was broken into its own patch to aid review.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-4-jason@jlekstrand.net
2021-07-16 21:47:07 +02:00
Matthew Auld
d22632c83b drm/i915: support forcing the page size with lmem
For some specialised objects we might need something larger than the
region's min_page_size due to some hw restriction, and slightly more
hairy is needing something smaller with the guarantee that such objects
will never be inserted into any GTT, which is the case for the paging
structures.

This also fixes how we set up the BO page_alignment, if we later migrate
the object somewhere else. For example if the placements are {SMEM,
LMEM}, then we might get this wrong. Pushing the min_page_size behaviour
into the manager should fix this.

v2(Thomas): push the default page size behaviour into buddy_man, and let
the user override it with the page-alignment, which looks cleaner
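
The default-plus-override behaviour can be sketched as a toy model. This is Python rather than kernel C; the function names and the byte-based units are assumptions for illustration, not the buddy manager's interface.

```python
def effective_min_page_size(default_page_size, page_alignment=0):
    """Use the manager default unless the caller forces an alignment."""
    return page_alignment if page_alignment else default_page_size


def alloc_size(size, default_page_size, page_alignment=0):
    """Round a request up to the effective minimum page size."""
    align = effective_min_page_size(default_page_size, page_alignment)
    return (size + align - 1) // align * align
```

With a 64 KiB default, a 5000-byte request rounds up to 65536 bytes, while forcing a 4 KiB alignment rounds it up to 8192 bytes instead.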

v3: rebase on ttm sys changes

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625103824.558481-1-matthew.auld@intel.com
2021-06-30 13:24:29 +01:00
Thomas Hellström
b6e913e19c drm/i915/gem: Implement object migration
Introduce an interface to migrate objects between regions.
This is primarily intended to migrate objects to LMEM for display and
to SYSTEM for dma-buf, but might be reused in one form or another for
performance-based migration.

v2:
- Verify that the memory region given as an id really exists.
  (Reported by Matthew Auld)
- Call i915_gem_object_{init,release}_memory_region() when switching region
  to handle also switching region lists. (Reported by Matthew Auld)
v3:
- Fix i915_gem_object_can_migrate() to return true if object is already in
  the correct region, even if the object ops doesn't have a migrate()
  callback.
- Update typo in commit message.
- Fix kerneldoc of i915_gem_object_wait_migration().
v4:
- Improve documentation (Suggested by Matthew Auld and Michael Ruhl)
- Always assume TTM migration hits a TTM move and unsets the pages through
  move_notify. (Reported by Matthew Auld)
- Add a dma_fence_might_wait() annotation to
  i915_gem_object_wait_migration() (Suggested by Daniel Vetter)
v5:
- Re-add might_sleep() instead of __dma_fence_might_wait(), Sent
  v4 with the wrong version, didn't compile and __dma_fence_might_wait()
  is not exported.
- Added an R-B.
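
The v2 and v3 checks above can be sketched as a toy model. This is Python rather than kernel C; can_migrate and its parameters are hypothetical, and the real function may apply further conditions not listed in this message.

```python
def can_migrate(obj_region, target, known_regions, has_migrate_op):
    """Toy version of the migration check."""
    if target not in known_regions:
        return False          # v2: the region id must really exist
    if obj_region == target:
        return True           # v3: already there, no migrate() op needed
    return has_migrate_op     # otherwise the object ops must support it
```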

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210629151203.209465-2-thomas.hellstrom@linux.intel.com
2021-06-30 11:32:11 +01:00
Thomas Hellström
32b7cf51a4 drm/i915/ttm: Use TTM for system memory
For discrete, use TTM for both cached and WC system memory. That means
we currently rely on the TTM memory accounting / shrinker. For cached
system memory we should consider remaining shmem-backed, which can be
implemented from our ttm_tt_populate callback. We can then also reuse our
own very elaborate shrinker for that memory.

If an object is evicted to a gem allowable region, we will now consider
the object migrated, and we flip the gem region and move the object to a
different region list. Since we are now changing gem regions, we can't
any longer rely on the CONTIGUOUS flag being set based on the region
min page size, so remove that flag update. If we want to reintroduce it,
we need to put it in the mutable flags.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-4-thomas.hellstrom@linux.intel.com
2021-06-24 18:51:01 +01:00
Thomas Hellström
3c2b8f326e drm/i915/ttm: Adjust gem flags and caching settings after a move
After a TTM move or object init we need to update the i915 gem flags and
caching settings to reflect the new placement. Currently caching settings
are not changed during the lifetime of an object, although that might
change moving forward if we run into performance issues or issues with
WC system page allocations.
Also introduce gpu_binds_iomem() and cpu_maps_iomem() to clean up the
various ways we previously used to detect this.
Finally, initialize the TTM object reserved to be able to update
flags and caching before anyone else gets hold of the object.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-3-thomas.hellstrom@linux.intel.com
2021-06-24 18:51:00 +01:00
Thomas Hellström
0ff375759f drm/i915: Update object placement flags to be mutable
The object ops I915_GEM_OBJECT_HAS_IOMEM and the object
I915_BO_ALLOC_STRUCT_PAGE flags are considered immutable by
much of our code. Introduce a new mem_flags member to hold these
and make sure checks for these flags being set are either done
under the object lock or with pages properly pinned. The flags
will change during migration under the object lock.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-2-thomas.hellstrom@linux.intel.com
2021-06-24 18:50:56 +01:00
Matthew Auld
4bc2d5747e drm/i915/ttm: fix static warning
warning: symbol 'i915_gem_ttm_obj_ops' was not declared. Should it be static?

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210623143411.293630-1-matthew.auld@intel.com
2021-06-24 10:00:02 +01:00
Thomas Hellström
b07a648383 drm/i915/ttm: Fix incorrect assumptions about ttm_bo_validate() semantics
We have assumed that if the current placement was not the requested
placement, but instead one of the busy placements, a TTM move would have
been triggered. That is not the case.

So when we initially place LMEM objects in "Limbo" (that is, system
placement without any pages allocated), to be able to defer clearing
objects until first get_pages(), the first get_pages() would happily keep
objects in system memory if that is one of the allowed placements. And
since we don't yet support i915 GEM system memory from TTM, everything
breaks apart.

So make sure we try the requested placement first, if no eviction is
needed. If that fails, retry with all allowed placements also allowing
evictions. Also make sure we handle TTM failure codes correctly.
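
The two-pass strategy can be sketched as a toy model. This is Python rather than kernel C. Region, validate and get_pages_placement are hypothetical names, and eviction is reduced to emptying a region; this is not how ttm_bo_validate() itself is implemented.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """Toy memory region with a fixed capacity (not a TTM type)."""
    name: str
    capacity: int
    used: int = 0

    def has_room(self, size):
        return self.used + size <= self.capacity

    def evict_all(self):
        # Toy eviction: pretend everything resident can be moved out.
        self.used = 0


def validate(size, placements, allow_evict):
    """Return the first region that can hold size, or None on failure."""
    for region in placements:
        if region.has_room(size):
            region.used += size
            return region
    if allow_evict:
        for region in placements:
            if size <= region.capacity:
                region.evict_all()
                region.used += size
                return region
    return None


def get_pages_placement(size, requested, allowed):
    """Pass 1: requested placement only, no eviction.
    Pass 2: every allowed placement, eviction permitted."""
    region = validate(size, [requested], allow_evict=False)
    if region is None:
        region = validate(size, allowed, allow_evict=True)
    if region is None:
        raise MemoryError("no allowed placement can hold the object")
    return region
```

An object only lands outside the requested region when the first pass fails, and a failure of both passes is reported to the caller rather than ignored.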

Also temporarily (until we support i915 GEM system on TTM), restrict
allowed placements to the requested placement to avoid things falling
apart should LMEM be full.

Fixes: 38f28c0695c0 ("drm/i915/ttm: Calculate the object placement at get_pages time")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618132515.163277-1-thomas.hellstrom@linux.intel.com
2021-06-18 17:35:16 +01:00