41 Commits

Author SHA1 Message Date
Thomas Hellström
8b1f7f92e5 drm/i915/ttm: Drop region reference counting
There is an interesting refcounting loop:
struct intel_memory_region has a struct ttm_resource_manager,
ttm_resource_manager->move may hold a reference to i915_request,
i915_request may hold a reference to intel_context,
intel_context may hold a reference to drm_i915_gem_object,
drm_i915_gem_object may hold a reference to intel_memory_region.

Break this loop by dropping region reference counting.

In addition, Have regions with a manager moving fence make sure
that all region objects are released before freeing the region.

v6:
- Fix a code comment.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-4-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:16 +01:00
Thomas Hellström
05d1c76107 drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
Move the i915_gem_obj_copy_ttm() function to i915_gem_ttm_move.h.
This will help keep a number of functions static when introducing
async moves.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122214554.371864-3-thomas.hellstrom@linux.intel.com
2021-11-25 09:36:15 +01:00
Matthew Auld
be373fad54 drm/i915/ttm: fixup build failure
drm-intel-gt-next fails to build with:

drivers/gpu/drm/i915/gem/i915_gem_ttm.c: In function ‘vm_fault_ttm’:
drivers/gpu/drm/i915/gem/i915_gem_ttm.c:862:23: error: too many arguments to function ‘ttm_bo_vm_fault_reserved’
  862 |                 ret = ttm_bo_vm_fault_reserved(vmf, vmf->vma->vm_page_prot,
      |                       ^~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211123125814.1703220-1-matthew.auld@intel.com
2021-11-23 16:36:22 +00:00
Dan Carpenter
6164807dd2 drm/i915/ttm: Fix error code in i915_ttm_eviction_valuable()
This function returns a bool type so returning -EBUSY is equivalent to
returning true.  It should return false instead.

Fixes: 7ae034590cea ("drm/i915/ttm: add tt shmem backend")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211122061438.GA2492@kili
2021-11-22 14:20:44 +00:00
Thomas Hellström
d3cb30f8dc drm/i915/ttm: Fix illegal addition to shrinker list
There's a small window of opportunity during which the adjust_lru()
function can be called with a GEM refcount of zero from the TTM
eviction code. This results in a kernel BUG().

Ensure that we don't attempt to modify the GEM shrinker lists unless
we have a GEM refcount.

Fixes: ebd4a8ec7799 ("drm/i915/ttm: move shrinker management into adjust_lru")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211110085527.1033475-1-thomas.hellstrom@linux.intel.com
2021-11-11 09:00:00 +01:00
Thomas Hellström
3589fdbd3b drm/i915/ttm: Reorganize the ttm move code
We are about to introduce failsafe- and asynchronous migration and
ttm moves.
This will add complexity and code to the TTM move code so it makes sense
to split it out to a separate file to make the i915 TTM code easer to
digest.
Split the i915 TTM move code out and since we will have to change the name
of the gpu_binds_iomem() and cpu_maps_iomem() functions anyway,
we alter the name of gpu_binds_iomem() to i915_ttm_gtt_binds_lmem() which
is more reflecting what it is used for.
With this we also add some more documentation. Otherwise there should be
no functional change.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211104110718.688420-2-thomas.hellstrom@linux.intel.com
2021-11-05 09:05:30 +01:00
Thomas Hellström
cad7109a2b drm/i915: Introduce refcounted sg-tables
As we start to introduce asynchronous failsafe object migration,
where we update the object state and then submit asynchronous
commands we need to record what memory resources are actually used
by various part of the command stream. Initially for three purposes:

1) Error capture.
2) Asynchronous migration error recovery.
3) Asynchronous vma bind.

At the time where these happens, the object state may have been updated
to be several migrations ahead and object sg-tables discarded.

In order to make it possible to keep sg-tables with memory resource
information for these operations, introduce refcounted sg-tables that
aren't freed until the last user is done with them.

The alternative would be to reference information sitting on the
corresponding ttm_resources which typically have the same lifetime as
these refcountes sg_tables, but that leads to other awkward constructs:
Due to the design direction chosen for ttm resource managers that would
lead to diamond-style inheritance, the LMEM resources may sometimes be
prematurely freed, and finally the subclassed struct ttm_resource would
have to bleed into the asynchronous vma bind code.

v3:
- Address a number of style issues (Matthew Auld)
v4:
- Dont check for st->sgl being NULL in i915_ttm_tt__shmem_unpopulate(),
  that should never happen. (Matthew Auld)
v5:
- Fix a Potential double-free (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211101122444.114607-1-thomas.hellstrom@linux.intel.com
2021-11-01 18:10:49 +01:00
Matthew Auld
5d12ffe6be drm/i915/ttm: enable shmem tt backend
Turn on the shmem tt backend, and enable shrinking.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-8-matthew.auld@intel.com
2021-10-22 13:19:30 +01:00
Matthew Auld
2eda4fc6d0 drm/i915/ttm: use cached system pages when evicting lmem
This should let us do an accelerated copy directly to the shmem pages
when temporarily moving lmem-only objects, where the i915-gem shrinker
can later kick in to swap out the pages, if needed.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-7-matthew.auld@intel.com
2021-10-22 13:19:29 +01:00
Matthew Auld
ebd4a8ec77 drm/i915/ttm: move shrinker management into adjust_lru
We currently just evict lmem objects to system memory when under memory
pressure. For this case we might lack the usual object mm.pages, which
effectively hides the pages from the i915-gem shrinker, until we
actually "attach" the TT to the object, or in the case of lmem-only
objects it just gets migrated back to lmem when touched again.

For all cases we can just adjust the i915 shrinker LRU each time we also
adjust the TTM LRU. The two cases we care about are:

  1) When something is moved by TTM, including when initially populating
     an object. Importantly this covers the case where TTM moves something from
     lmem <-> smem, outside of the normal get_pages() interface, which
     should still ensure the shmem pages underneath are reclaimable.

  2) When calling into i915_gem_object_unlock(). The unlock should
     ensure the object is removed from the shinker LRU, if it was indeed
     swapped out, or just purged, when the shrinker drops the object lock.

v2(Thomas):
  - Handle managing the shrinker LRU in adjust_lru, where it is always
    safe to touch the object.
v3(Thomas):
  - Pretty much a re-write. This time piggy back off the shrink_pin
    stuff, which actually seems to fit quite well for what we want here.
v4(Thomas):
  - Just use a simple boolean for tracking ttm_shrinkable.
v5:
  - Ensure we call adjust_lru when faulting the object, to ensure the
    pages are visible to the shrinker, if needed.
  - Add back the adjust_lru when in i915_ttm_move (Thomas)
v6(Reported-by: kernel test robot <lkp@intel.com>):
  - Remove unused i915_tt

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #v4
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-6-matthew.auld@intel.com
2021-10-22 13:19:26 +01:00
Matthew Auld
7ae034590c drm/i915/ttm: add tt shmem backend
For cached objects we can allocate our pages directly in shmem. This
should make it possible(in a later patch) to utilise the existing
i915-gem shrinker code for such objects. For now this is still disabled.

v2(Thomas):
  - Add optional try_to_writeback hook for objects. Importantly we need
    to check if the object is even still shrinkable; in between us
    dropping the shrinker LRU lock and acquiring the object lock it could for
    example have been moved. Also we need to differentiate between
    "lazy" shrinking and the immediate writeback mode. Also later we need to
    handle objects which don't even have mm.pages, so bundling this into
    put_pages() would require somehow handling that edge case, hence
    just letting the ttm backend handle everything in try_to_writeback
    doesn't seem too bad.
v3(Thomas):
  - Likely a bad idea to touch the object from the unpopulate hook,
    since it's not possible to hold a reference, without also creating
    circular dependency, so likely this is too fragile. For now just
    ensure we at least mark the pages as dirty/accessed when called from the
    shrinker on WILLNEED objects.
  - s/try_to_writeback/shrinker_release_pages, since this can do more
    than just writeback.
  - Get rid of do_backup boolean and just set the SWAPPED flag prior to
    calling unpopulate.
  - Keep shmem_tt as lowest priority for the TTM LRU bo_swapout walk, since
    these just get skipped anyway. We can try to come up with something
    better later.
v4(Thomas):
  - s/PCI_DMA/DMA/. Also drop NO_KERNEL_MAPPING and NO_WARN, which
    apparently doesn't do anything with streaming mappings.
  - Just pass along the error for ->truncate, and assume nothing.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Oak Zeng <oak.zeng@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Oak Zeng <oak.zeng@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211018091055.1998191-2-matthew.auld@intel.com
2021-10-22 13:19:20 +01:00
Dave Airlie
1176d15f0f Merge tag 'drm-intel-gt-next-2021-10-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Add uAPI for using PXP protected objects

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064

- Add PCI IDs and LMEM discovery/placement uAPI for DG1

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584

- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S

Cross-subsystem Changes:

- Merges 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"

Core Changes:

- Update ttm_move_memcpy for async use (Thomas)

Driver Changes:

- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
  Sean, Anshuman)
  See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)

- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)

- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
  PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
2021-10-11 18:09:39 +10:00
Thomas Hellström
068396bb21 drm/i915/ttm: Rework object initialization slightly
We may end up in i915_ttm_bo_destroy() in an error path before the
object is fully initialized. In that case it's not correct to call
__i915_gem_free_object(), because that function
a) Assumes the gem object refcount is 0, which it isn't.
b) frees the placements which are owned by the caller until the
init_object() region ops returns successfully. Fix this by providing
a lightweight cleanup function __i915_gem_object_fini() which is also
called by __i915_gem_free_object().

While doing this, also make sure we call dma_resv_fini() as part of
ordinary object destruction and not from the RCU callback that frees
the object. This will help track down bugs where the object is incorrectly
locked from an RCU lookup.

Finally, make sure the object isn't put on the region list until it's
either locked or fully initialized in order to block list processing of
partially initialized objects.

v2:
- The TTM object backend memory was freed before the gem pages were
  put. Separate this functionality into __i915_gem_object_pages_fini()
  and call it from the TTM delete_mem_notify() callback.
v3:
- Include i915_gem_object_free_mmaps() in __i915_gem_object_pages_fini()
  to make sure we don't inadvertedly introduce a race.

Fixes: 48b096126954 ("drm/i915: Move __i915_gem_free_object to ttm_bo_destroy")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v1
Link: https://patchwork.freedesktop.org/patch/msgid/20210930113236.583531-1-thomas.hellstrom@linux.intel.com
2021-10-01 13:11:58 +02:00
Matthew Auld
43d46f0b78 drm/ttm: s/FLAG_SG/FLAG_EXTERNAL/
It covers more than just ttm_bo_type_sg usage, like with say dma-buf,
since one other user is userptr in amdgpu, and in the future we might
have some more. Hence EXTERNAL is likely a more suitable name.

v2(Christian):
  - Rename these to TTM_TT_FLAGS_*
  - Fix up all the holes in the flag values

Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Christian König <christian.koenig@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210929132629.353541-1-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
2021-09-29 16:17:56 +02:00
Thomas Hellström
c56ce95653 drm/i915 Implement LMEM backup and restore for suspend / resume
Just evict unpinned objects to system. For pinned LMEM objects,
make a backup system object and blit the contents to that.

Backup is performed in three steps,
1: Opportunistically evict evictable objects using the gpu blitter.
2: After gt idle, evict evictable objects using the gpu blitter. This will
be modified in an upcoming patch to backup pinned objects that are not used
by the blitter itself.
3: Backup remaining pinned objects using memcpy.

Also move uC suspend to after 2) to make sure we have a functional GuC
during 2) if using GuC submission.

v2:
- Major refactor to make sure gem_exec_suspend@hang-SX subtests work, and
  suspend / resume works with a slightly modified GuC submission enabling
  patch series.

v3:
- Fix a potential use-after-free (Matthew Auld)
- Use i915_gem_object_create_shmem() instead of
  i915_gem_object_create_region (Matthew Auld)
- Minor simplifications (Matthew Auld)
- Fix up kerneldoc for i195_ttm_restore_region().
- Final lmem_suspend() call moved to i915_gem_backup_suspend from
  i915_gem_suspend_late, since the latter gets called at driver unload
  and we don't unnecessarily want to run it at that time.

v4:
- Interface change of ttm- & lmem suspend / resume functions to use
  flags rather than bools. (Matthew Auld)
- Completely drop the i915_gem_backup_suspend change (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-5-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:11 +02:00
Thomas Hellström
0d9388635a drm/i915/ttm: Implement a function to copy the contents of two TTM-based objects
When backing up or restoring contents of pinned objects at suspend /
resume time we need to allocate a new object as the backup. Add a function
to facilitate copies between the two. Some data needs to be copied before
the migration context is ready for operation, so make sure we can
disable accelerated copies.

v2:
- Fix a missing return value check (Matthew Auld)

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-2-thomas.hellstrom@linux.intel.com
2021-09-24 08:19:09 +02:00
Maarten Lankhorst
48b0961269 drm/i915: Move __i915_gem_free_object to ttm_bo_destroy
When we implement delayed destroy, we may have a second
call to the delete_mem_notify() handler, while free_object()
only should be called once.

Move it to bo->destroy(), to ensure it's only called once.
This fixes some weird memory corruption issues with delayed
destroy when async eviction is used.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210830121006.2978297-2-maarten.lankhorst@linux.intel.com
Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
2021-09-16 15:30:27 +02:00
Joonas Lahtinen
d5dd580deb Merge drm/drm-next into drm-intel-gt-next
Close the divergence which has caused patches not to apply and
have a solid baseline for the PXP patches that Rodrigo will send
a topic branch PR for.

Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
2021-09-15 13:23:27 +03:00
Maxime Ripard
2f76520561
Merge drm/drm-next into drm-misc-next
Kickstart new drm-misc-next cycle.

Signed-off-by: Maxime Ripard <maxime@cerno.tech>
2021-09-14 09:25:30 +02:00
Linus Torvalds
23852bec53 RDMA v5.15 merge window Pull Request
- Various cleanup and small features for rtrs
 
 - kmap_local_page() conversions
 
 - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns
 
 - Cache the IB subnet prefix
 
 - Rework how CRC is calcuated in rxe
 
 - Clean reference counting in iwpm's netlink
 
 - Pull object allocation and lifecycle for user QPs to the uverbs core
   code
 
 - Several small hns features and continued general code cleanups
 
 - Fix the scatterlist confusion of orig_nents/nents introduced in an
   earlier patch creating the append operation
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmEudRgACgkQOG33FX4g
 mxraJA//c6bMxrrTVrzmrtrkyYD4tYWE8RDfgvoyZtleZnnEOJeunCQWakQrpJSv
 ukSnOGCA3PtnmRMdV54f/11YJ/7otxOJodSO7jWsIoBrqG/lISAdX8mn2iHhrvJ0
 dIaFEFPLy0WqoMLCJVIYIupR0IStVHb/mWx0uYL4XnnoYKyt7f7K5JMZpNWMhDN2
 ieJw0jfrvEYm8pipWuxUvB16XARlzAWQrjqLpMRI+jFRpbDVBY21dz2/LJvOJPrA
 LcQ+XXsV/F659ibOAGm6bU4BMda8fE6Lw90B/gmhSswJ205NrdziF5cNYHP0QxcN
 oMjrjSWWHc9GEE7MTipC2AH8e36qob16Q7CK+zHEJ+ds7R6/O/8XmED1L8/KFpNA
 FGqnjxnxsl1y27mUegfj1Hh8PfoDp2oVq0lmpEw0CYo4cfVzHSMRrbTR//XmW628
 Ie/mJddpFK4oLk+QkSNjSLrnxOvdTkdA58PU0i84S5eUVMNm41jJDkxg2J7vp0Zn
 sclZsclhUQ9oJ5Q2so81JMWxu4JDn7IByXL0ULBaa6xwQTiVEnyvSxSuPlflhLRW
 0vI2ylATYKyWkQqyX7VyWecZJzwhwZj5gMMWmoGsij8bkZhQ/VaQMaesByzSth+h
 NV5UAYax4GqyOQ/tg/tqT6e5nrI1zof87H64XdTCBpJ7kFyQ/oA=
 =ZwOe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is quite a small cycle, no major series stands out. The HNS and
  rxe drivers saw the most activity this cycle, with rxe being broken
  for a good chunk of time. The significant deleted line count is due to
  a SPDX cleanup series.

  Summary:

   - Various cleanup and small features for rtrs

   - kmap_local_page() conversions

   - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns

   - Cache the IB subnet prefix

   - Rework how CRC is calcuated in rxe

   - Clean reference counting in iwpm's netlink

   - Pull object allocation and lifecycle for user QPs to the uverbs
     core code

   - Several small hns features and continued general code cleanups

   - Fix the scatterlist confusion of orig_nents/nents introduced in an
     earlier patch creating the append operation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
  RDMA/mlx5: Relax DCS QP creation checks
  RDMA/hns: Delete unnecessary blank lines.
  RDMA/hns: Encapsulate the qp db as a function
  RDMA/hns: Adjust the order in which irq are requested and enabled
  RDMA/hns: Remove RST2RST error prints for hw v1
  RDMA/hns: Remove dqpn filling when modify qp from Init to Init
  RDMA/hns: Fix QP's resp incomplete assignment
  RDMA/hns: Fix query destination qpn
  RDMA/hfi1: Convert to SPDX identifier
  IB/rdmavt: Convert to SPDX identifier
  RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
  RDMA/hns: Bugfix for the missing assignment for dip_idx
  RDMA/hns: Bugfix for data type of dip_idx
  RDMA/hns: Fix incorrect lsn field
  RDMA/irdma: Remove the repeated declaration
  RDMA/core/sa_query: Retry SA queries
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
  RDMA/hns: Delete unused hns bitmap interface
  ...
2021-09-02 14:47:21 -07:00
Thomas Hellström
669076334b drm/ttm, drm/i915: Update ttm_move_memcpy for async use
The buffer object argument to ttm_move_memcpy was only used to
determine whether the destination memory should be cleared only
or whether we should copy data. Replace it with a "clear" bool, and
update the callers.

The intention here is to be able to use ttm_move_memcpy() async under
a dma-fence as a fallback if an accelerated blit fails in a security-
critical path where data might leak if the blit is not properly
performed. For that purpose the bo is an unsuitable argument since
its relevant members might already have changed at call time.

Finally, update the ttm_move_memcpy kerneldoc that seems to have
ended up with a stale version.

Cc: Christian König <christian.koenig@amd.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Link: https://lore.kernel.org/r/20210813144331.372957-3-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210813144331.372957-3-thomas.hellstrom@linux.intel.com
2021-08-25 16:05:47 +02:00
Thomas Hellström
d8ac30fd47 drm/i915/ttm: Reorganize the ttm move code somewhat
In order to make the code a bit more readable and to facilitate
async memcpy moves, reorganize the move code a little. Determine
at an early stage whether to copy or to clear.

v2:
- Don't set up the memcpy iterators unless we are actually going to memcpy.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://lore.kernel.org/r/20210813144331.372957-2-thomas.hellstrom@linux.intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210813144331.372957-2-thomas.hellstrom@linux.intel.com
2021-08-25 16:05:47 +02:00
Christian König
d5f45d1e2f drm/ttm: remove ttm_tt_destroy_common v2
Move the functionality into ttm_tt_fini and ttm_bo_tt_destroy instead.

We don't need this any more since we removed the unbind from the destroy
code paths in the drivers.

Also add a warning to ttm_tt_fini() if we try to fini a still populated TT
object.

v2: instead of reverting the patch move the functionality to different
places.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210728130552.2074-5-christian.koenig@amd.com
2021-08-23 13:54:55 +02:00
Jason Ekstrand
75e382850b drm/i915/gem/ttm: Only call __i915_gem_object_set_pages if needed
__i915_ttm_get_pages does two things.  First, it calls ttm_bo_validate()
to check the given placement and migrate the BO if needed.  Then, it
updates the GEM object to match, in case the object was migrated.  If
no migration occured, however, we might still have pages on the GEM
object in which case we don't need to fetch them from TTM and call
__i915_gem_object_set_pages.  This hasn't been a problem before because
the primary user of __i915_ttm_get_pages is __i915_gem_object_get_pages
which only calls it if the GEM object doesn't have pages.

However, i915_ttm_migrate also uses __i915_ttm_get_pages to do the
migration so this meant it was unsafe to call on an already populated
object.  This patch checks i915_gem_object_has_pages() before trying to
__i915_gem_object_set_pages so i915_ttm_migrate is safe to call, even on
populated objects.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210723172142.3273510-6-jason@jlekstrand.net
2021-07-26 16:37:34 +01:00
Jason Ekstrand
7d6a276e2f drm/i915: Remove allow_alloc from i915_gem_object_get_sg*
This reverts the rest of 0edbb9ba1bfe ("drm/i915: Move cmd parser
pinning to execbuffer").  Now that the only user of i915_gem_object_get_sg
without allow_alloc has been removed, we can drop the parameter.  This
portion of the revert was broken into its own patch to aid review.

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Jon Bloomfield <jon.bloomfield@intel.com>
Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-4-jason@jlekstrand.net
2021-07-16 21:47:07 +02:00
Matthew Auld
d22632c83b drm/i915: support forcing the page size with lmem
For some specialised objects we might need something larger than the
regions min_page_size due to some hw restriction, and slightly more
hairy is needing something smaller with the guarantee that such objects
will never be inserted into any GTT, which is the case for the paging
structures.

This also fixes how we setup the BO page_alignment, if we later migrate
the object somewhere else. For example if the placements are {SMEM,
LMEM}, then we might get this wrong. Pushing the min_page_size behaviour
into the manager should fix this.

v2(Thomas): push the default page size behaviour into buddy_man, and let
the user override it with the page-alignment, which looks cleaner

v3: rebase on ttm sys changes

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210625103824.558481-1-matthew.auld@intel.com
2021-06-30 13:24:29 +01:00
Thomas Hellström
b6e913e19c drm/i915/gem: Implement object migration
Introduce an interface to migrate objects between regions.
This is primarily intended to migrate objects to LMEM for display and
to SYSTEM for dma-buf, but might be reused in one form or another for
performance-based migration.

v2:
- Verify that the memory region given as an id really exists.
  (Reported by Matthew Auld)
- Call i915_gem_object_{init,release}_memory_region() when switching region
  to handle also switching region lists. (Reported by Matthew Auld)
v3:
- Fix i915_gem_object_can_migrate() to return true if object is already in
  the correct region, even if the object ops doesn't have a migrate()
  callback.
- Update typo in commit message.
- Fix kerneldoc of i915_gem_object_wait_migration().
v4:
- Improve documentation (Suggested by Mattew Auld and Michael Ruhl)
- Always assume TTM migration hits a TTM move and unsets the pages through
  move_notify. (Reported by Matthew Auld)
- Add a dma_fence_might_wait() annotation to
  i915_gem_object_wait_migration() (Suggested by Daniel Vetter)
v5:
- Re-add might_sleep() instead of __dma_fence_might_wait(), Sent
  v4 with the wrong version, didn't compile and __dma_fence_might_wait()
  is not exported.
- Added an R-B.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210629151203.209465-2-thomas.hellstrom@linux.intel.com
2021-06-30 11:32:11 +01:00
Thomas Hellström
32b7cf51a4 drm/i915/ttm: Use TTM for system memory
For discrete, use TTM for both cached and WC system memory. That means
we currently rely on the TTM memory accounting / shrinker. For cached
system memory we should consider remaining shmem-backed, which can be
implemented from our ttm_tt_populate callback. We can then also reuse our
own very elaborate shrinker for that memory.

If an object is evicted to a gem allowable region, we will now consider
the object migrated, and we flip the gem region and move the object to a
different region list. Since we are now changing gem regions, we can't
any longer rely on the CONTIGUOUS flag being set based on the region
min page size, so remove that flag update. If we want to reintroduce it,
we need to put it in the mutable flags.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-4-thomas.hellstrom@linux.intel.com
2021-06-24 18:51:01 +01:00
Thomas Hellström
3c2b8f326e drm/i915/ttm: Adjust gem flags and caching settings after a move
After a TTM move or object init we need to update the i915 gem flags and
caching settings to reflect the new placement. Currently caching settings
are not changed during the lifetime of an object, although that might
change moving forward if we run into performance issues or issues with
WC system page allocations.
Also introduce gpu_binds_iomem() and cpu_maps_iomem() to clean up the
various ways we previously used to detect this.
Finally, initialize the TTM object reserved to be able to update
flags and caching before anyone else gets hold of the object.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-3-thomas.hellstrom@linux.intel.com
2021-06-24 18:51:00 +01:00
Thomas Hellström
0ff375759f drm/i915: Update object placement flags to be mutable
The object ops i915_GEM_OBJECT_HAS_IOMEM and the object
I915_BO_ALLOC_STRUCT_PAGE flags are considered immutable by
much of our code. Introduce a new mem_flags member to hold these
and make sure checks for these flags being set are either done
under the object lock or with pages properly pinned. The flags
will change during migration under the object lock.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210624084240.270219-2-thomas.hellstrom@linux.intel.com
2021-06-24 18:50:56 +01:00
Matthew Auld
4bc2d5747e drm/i915/ttm: fix static warning
warning: symbol 'i915_gem_ttm_obj_ops' was not declared. Should it be static?

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210623143411.293630-1-matthew.auld@intel.com
2021-06-24 10:00:02 +01:00
Thomas Hellström
b07a648383 drm/i915/ttm: Fix incorrect assumptions about ttm_bo_validate() semantics
We have assumed that if the current placement was not the requested
placement, but instead one of the busy placements, a TTM move would have
been triggered. That is not the case.

So when we initially place LMEM objects in "Limbo", (that is system
placement without any pages allocated), to be able to defer clearing
objects until first get_pages(), the first get_pages() would happily keep
objects in system memory if that is one of the allowed placements. And
since we don't yet support i915 GEM system memory from TTM, everything
breaks apart.

So make sure we try the requested placement first, if no eviction is
needed. If that fails, retry with all allowed placements also allowing
evictions. Also make sure we handle TTM failure codes correctly.

Also temporarily (until we support i915 GEM system on TTM), restrict
allowed placements to the requested placement to avoid things falling
apart should LMEM be full.

Fixes: 38f28c0695c0 ("drm/i915/ttm: Calculate the object placement at get_pages time")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210618132515.163277-1-thomas.hellstrom@linux.intel.com
2021-06-18 17:35:16 +01:00
Ramalingam C
50331a7b50 drm/i915/ttm: accelerated move implementation
Invokes the pipelined page migration through blt, for
i915_ttm_move requests of eviction and also obj clear.

Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210617063018.92802-11-thomas.hellstrom@linux.intel.com
2021-06-17 14:23:12 +01:00
Matthew Auld
13c2ceb6ad drm/i915/ttm: restore min_page_size behaviour
We now have bo->page_alignment which perfectly describes what we need if
we have min page size restrictions for lmem. We can also drop the flag
here, since this is the default behaviour for all objects.

v2(Thomas):
    - bo->page_alignment is in page units

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-7-matthew.auld@intel.com
2021-06-16 16:48:02 +01:00
Matthew Auld
d53ec322dc drm/i915/ttm: switch over to ttm_buddy_man
Move back to the buddy allocator for managing device local memory, and
restore the lost mock selftests. Keep around the range manager related
bits, since we likely need this for managing stolen at some point. For
stolen we also don't need to reserve anything so no need to support a
generic reserve interface.

v2(Thomas):
    - bo->page_alignment is in page units, not bytes

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-6-matthew.auld@intel.com
2021-06-16 16:48:02 +01:00
Matthew Auld
687c7d0fcf drm/i915/ttm: remove node usage in our naming
Now that ttm_resource_manager just returns a generic ttm_resource we
don't need to reference the mm_node stuff anymore which mostly only
makes sense for drm_mm_node. In the next few patches we want switch over
to the ttm_buddy_man which is just another type of ttm_resource so
reflect that in the naming.

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-5-matthew.auld@intel.com
2021-06-16 16:48:02 +01:00
Matthew Auld
beb6a22911 drm/i915/ttm: pass along the I915_BO_ALLOC_CONTIGUOUS
Currently we just ignore the I915_BO_ALLOC_CONTIGUOUS flag, which is
fine since everything is already contiguous with the ttm range manager.
However in the next patch we want to switch over to the ttm buddy
manager, where allocations are by default not contiguous.

v2(Thomas):
    - Forward ALLOC_CONTIG for all regions

Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-4-matthew.auld@intel.com
2021-06-16 16:48:02 +01:00
Thomas Hellström
38f28c0695 drm/i915/ttm: Calculate the object placement at get_pages time
Instead of relying on a static placement, calculate at get_pages() time.
This should work for LMEM regions and system for now. For stolen we need
to take preallocated range into account. That will if needed be added
later.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210616152501.394518-3-matthew.auld@intel.com
2021-06-16 16:47:54 +01:00
Thomas Hellström
c865204e84 drm/i915/ttm: Fix memory leaks
Fix two memory leaks introduced with the ttm backend.

Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210615122408.32347-1-thomas.hellstrom@linux.intel.com
2021-06-16 15:57:52 +01:00
Maarten Lankhorst
cf3e3e86d7 drm/i915: Use ttm mmap handling for ttm bo's.
Use the ttm handlers for servicing page faults, and vm_access.

We do our own validation of read-only access, otherwise use the
ttm handlers as much as possible.

Because the ttm handlers expect the vma_node at vma->base, we slightly
need to massage the mmap handlers to look at vma_node->driver_private
to fetch the bo, if it's NULL, we assume i915's normal mmap_offset uapi
is used.

This is the easiest way to achieve compatibility without changing ttm's
semantics.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610070152.572423-5-thomas.hellstrom@linux.intel.com
2021-06-11 10:53:25 +02:00
Thomas Hellström
213d509277 drm/i915/ttm: Introduce a TTM i915 gem object backend
Most logical place to introduce TTM buffer objects is as an i915
gem object backend. We need to add some ops to account for added
functionality like delayed delete and LRU list manipulation.

Initially we support only LMEM and SYSTEM memory, but SYSTEM
(which in this case means evicted LMEM objects) is not
visible to i915 GEM yet. The plan is to move the i915 gem system region
over to the TTM system memory type in upcoming patches.

We set up GPU bindings directly both from LMEM and from the system region,
as there is no need to use the legacy TTM_TT memory type. We reserve
that for future porting of GGTT bindings to TTM.

Remove the old lmem backend.

Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210610070152.572423-2-thomas.hellstrom@linux.intel.com
2021-06-11 10:53:06 +02:00