2bc9c04ea7
Add an entry for the new uAPI needed for DG1. Also add the overall upstream plan, including some notes for the TTM conversion. v2(Daniel): - include the overall upstreaming plan - add a note for mmap, there are differences here for TTM vs i915 - bunch of other suggestions from Daniel v3: (Daniel) - add a note for set/get caching stuff - add some more docs for existing query and extensions stuff - add an actual code example for regions query - bunch of other stuff (Jason) - uAPI change(!): - try a simpler design with the placements extension - rather than have a generic setparam which can cover multiple use cases, have each extension be responsible for one thing only v4: (Daniel) - add some more notes for ttm conversion - bunch of other stuff (Jason) - uAPI change(!): - drop all the extra rsvd members for the region_query and region_info, just keep the bare minimum needed for padding v5: (Jason) - for the upstream plan, add a requirement that we send the uAPI bits again for final sign off before turning it on for real - document how we intend to extend the rsvd bits for the region query (Kenneth) - improve the comment for the smem+lmem mmap mode and caching Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com> Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: Lionel Landwerlin <lionel.g.landwerlin@linux.intel.com> Cc: Jon Bloomfield <jon.bloomfield@intel.com> Cc: Jordan Justen <jordan.l.justen@intel.com> Cc: Daniel Vetter <daniel.vetter@intel.com> Cc: Kenneth Graunke <kenneth@whitecape.org> Cc: Jason Ekstrand <jason@jlekstrand.net> Cc: Dave Airlie <airlied@gmail.com> Cc: dri-devel@lists.freedesktop.org Cc: mesa-dev@lists.freedesktop.org Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: Dave Airlie <airlied@redhat.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Acked-by: Jon Bloomfield <jon.bloomfield@intel.com> Link: 
https://patchwork.freedesktop.org/patch/msgid/20210429103056.407067-1-matthew.auld@intel.com
132 lines
5.8 KiB
ReStructuredText
=========================
I915 DG1/LMEM RFC Section
=========================

Upstream plan
=============
For upstream the overall plan for landing all the DG1 stuff and turning it on for
real, with all the uAPI bits, is:

* Merge basic HW enabling of DG1 (still without pciid)
* Merge the uAPI bits behind special CONFIG_BROKEN (or so) flag

  * At this point we can still make changes, but importantly this lets us
    start running IGTs which can utilize local-memory in CI

* Convert over to TTM, make sure it all keeps working. Some of the work items:

  * TTM shrinker for discrete
  * dma_resv_lockitem for full dma_resv_lock, i.e. not just trylock
  * Use TTM CPU pagefault handler
  * Route shmem backend over to TTM SYSTEM for discrete
  * TTM purgeable object support
  * Move i915 buddy allocator over to TTM
  * MMAP ioctl mode (see `I915 MMAP`_)
  * SET/GET ioctl caching (see `I915 SET/GET CACHING`_)

* Send RFC (with mesa-dev on cc) for final sign off on the uAPI
* Add pciid for DG1 and turn on uAPI for real

New object placement and region query uAPI
==========================================
Starting from DG1 we need to give userspace the ability to allocate buffers from
device local-memory. Currently the driver supports gem_create, which can place
buffers in system memory via shmem, and the usual assortment of other
interfaces, like dumb buffers and userptr.

To support this new capability, while also providing a uAPI which will work
beyond just DG1, we propose to offer three new bits of uAPI:

DRM_I915_QUERY_MEMORY_REGIONS
-----------------------------
New query ID which allows userspace to discover the list of supported memory
regions (like system-memory and local-memory) for a given device. We identify
each region with a class and instance pair, which should be unique. The class
here would be DEVICE or SYSTEM, and the instance would be zero, on platforms
like DG1.

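
The class/instance lookup can be sketched as follows. This is only an
illustration: the ``example_*`` types and names are hypothetical stand-ins
invented here, not the proposed uAPI structures, and the region list is assumed
to be what a DG1-like device would report (one SYSTEM and one DEVICE region,
both with instance zero).

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical stand-ins for the region class and class/instance pair. */
   enum example_memory_class {
       EXAMPLE_MEMORY_CLASS_SYSTEM = 0,
       EXAMPLE_MEMORY_CLASS_DEVICE = 1,
   };

   struct example_region {
       uint16_t memory_class;
       uint16_t memory_instance;
   };

   /* Assumed region list for a DG1-like device. */
   static const struct example_region example_dg1_regions[] = {
       { EXAMPLE_MEMORY_CLASS_SYSTEM, 0 },
       { EXAMPLE_MEMORY_CLASS_DEVICE, 0 },
   };

   /* Return the region matching the class/instance pair, or NULL if none. */
   static const struct example_region *
   example_find_region(const struct example_region *regions, size_t count,
                       uint16_t memory_class, uint16_t memory_instance)
   {
       for (size_t i = 0; i < count; i++) {
           if (regions[i].memory_class == memory_class &&
               regions[i].memory_instance == memory_instance)
               return &regions[i];
       }
       return NULL;
   }

Since each pair should be unique, the first match is also the only match.
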
Side note: The class/instance design is borrowed from our existing engine uAPI,
where we describe every physical engine in terms of its class, and the
particular instance, since we can have more than one per class.

In the future we also want to expose more information which can further
describe the capabilities of a region.

.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
        :functions: drm_i915_gem_memory_class drm_i915_gem_memory_class_instance drm_i915_memory_region_info drm_i915_query_memory_regions

GEM_CREATE_EXT
--------------
New ioctl which is basically just gem_create but now allows userspace to provide
a chain of possible extensions. Note that if we don't provide any extensions and
set flags=0 then we get the exact same behaviour as gem_create.

Side note: We also need to support PXP[1] in the near future, which is also
applicable to integrated platforms, and adds its own gem_create_ext extension,
which basically lets userspace mark a buffer as "protected".

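
To illustrate what a chain of extensions means in practice, here is a minimal
sketch of walking one. The ``example_extension`` header and the extension IDs
are assumptions made up for this example; the real layout lives in the uAPI
headers. An empty chain corresponds to the plain gem_create behaviour.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical chained extension header (simplified, assumed layout). */
   struct example_extension {
       uint64_t next_extension; /* user pointer to the next header, 0 ends the chain */
       uint32_t name;           /* identifies which extension this is */
   };

   /* Hypothetical extension IDs, purely for illustration. */
   #define EXAMPLE_EXT_MEMORY_REGIONS 0u
   #define EXAMPLE_EXT_PROTECTED      1u

   /* Count the extensions in a chain; 0 means "no extensions". */
   static size_t example_count_extensions(uint64_t first)
   {
       size_t count = 0;

       for (uint64_t cur = first; cur != 0; count++)
           cur = ((const struct example_extension *)(uintptr_t)cur)->next_extension;
       return count;
   }

   /* Return the first extension with the given name, or NULL if absent. */
   static const struct example_extension *
   example_find_extension(uint64_t first, uint32_t name)
   {
       for (uint64_t cur = first; cur != 0;) {
           const struct example_extension *ext =
               (const struct example_extension *)(uintptr_t)cur;

           if (ext->name == name)
               return ext;
           cur = ext->next_extension;
       }
       return NULL;
   }

This sketch trusts the chain to terminate. Code handling untrusted input would
also need to bound the walk and copy each header from userspace before reading
it.
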
.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
        :functions: drm_i915_gem_create_ext

I915_GEM_CREATE_EXT_MEMORY_REGIONS
----------------------------------
Implemented as an extension for gem_create_ext, we would now allow userspace to
optionally provide an immutable list of preferred placements at creation time,
in priority order, for a given buffer object. For the placements we expect
them each to use the class/instance encoding, as per the output of the regions
query. Having the list in priority order will be useful in the future when
placing an object, say during eviction.

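
One way a priority-ordered placement list could be consumed is sketched below.
This is a hypothetical policy for illustration only, not the driver's actual
behaviour: it picks the first preferred placement whose region still has room.
All ``example_*`` names, and the notion of a per-region free byte count, are
assumptions of this sketch.

.. code-block:: c

   #include <stddef.h>
   #include <stdint.h>

   /* Hypothetical class values: 0 = SYSTEM, 1 = DEVICE. */
   struct example_placement {
       uint16_t memory_class;
       uint16_t memory_instance;
   };

   /* Hypothetical bookkeeping: how much space a region has left. */
   struct example_region_state {
       struct example_placement region;
       uint64_t free_bytes;
   };

   /*
    * Walk the placements in priority order and return the index of the first
    * one whose region can hold `size` bytes, or -1 if none can.
    */
   static int example_pick_placement(const struct example_placement *placements,
                                     size_t num_placements,
                                     const struct example_region_state *regions,
                                     size_t num_regions, uint64_t size)
   {
       for (size_t i = 0; i < num_placements; i++) {
           for (size_t j = 0; j < num_regions; j++) {
               if (regions[j].region.memory_class == placements[i].memory_class &&
                   regions[j].region.memory_instance == placements[i].memory_instance &&
                   regions[j].free_bytes >= size)
                   return (int)i;
           }
       }
       return -1;
   }

With a list of "local-memory first, system memory second", an object that no
longer fits in local-memory would fall back to system memory under this policy.
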
.. kernel-doc:: Documentation/gpu/rfc/i915_gem_lmem.h
        :functions: drm_i915_gem_create_ext_memory_regions

One fair criticism here is that this seems a little over-engineered[2]. If we
just consider DG1 then yes, a simple gem_create.flags or something is all
that's needed to tell the kernel to allocate the buffer in local-memory or
whatever. However, looking to the future we need uAPI which can also support
the upcoming Xe HP multi-tile architecture in a sane way, where there can be
multiple local-memory instances for a given device, and so using both class and
instance in our uAPI to describe regions is desirable, although specifically
for DG1 it's uninteresting, since we only have a single local-memory instance.

Existing uAPI issues
====================
Some potential issues we still need to resolve.

I915 MMAP
---------
In i915 there are multiple ways to mmap a GEM object, including mapping the same
object using different mapping types (WC vs WB), i.e. multiple active mmaps per
object. TTM expects one mmap at most for the lifetime of the object. If it
turns out that we have to backpedal here, there might be some potential
userspace fallout.

I915 SET/GET CACHING
--------------------
In i915 we have the set/get_caching ioctls. TTM doesn't let us change this, but
DG1 doesn't support non-snooped PCIe transactions, so we can just always
allocate as WB for smem-only buffers. If/when our hw gains support for
non-snooped PCIe transactions then we must fix this mode at allocation time as
a new GEM extension.

This is related to the mmap problem, because in general (meaning, when we're
not running on Intel CPUs) the CPU mmap must not, ever, be inconsistent with
the allocation mode.

One possible idea is to let the kernel pick the mmap mode for userspace from the
following table:

smem-only: WB. Userspace does not need to call clflush.

smem+lmem: We only ever allow a single mode, so simply allocate this as uncached
memory, and always give userspace a WC mapping. GPU still does snooped access
here (assuming we can't turn it off like on DG1), which is a bit inefficient.

lmem only: always WC

This means on discrete you only get a single mmap mode; all others must be
rejected. That's probably going to be a new default mode or something like
that.

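
The proposed selection of a single mmap mode per object can be written down as
a small function. This is a sketch of the idea as described in this section,
not an existing interface: the placement mask and the ``example_*`` names are
hypothetical, and the caller is assumed to pass a non-empty mask.

.. code-block:: c

   /* Hypothetical bitmask describing where an object may be placed. */
   #define EXAMPLE_PLACE_SMEM (1u << 0)
   #define EXAMPLE_PLACE_LMEM (1u << 1)

   enum example_mmap_mode {
       EXAMPLE_MMAP_WB,
       EXAMPLE_MMAP_WC,
   };

   /* Pick the one mmap mode allowed for an object with these placements. */
   static enum example_mmap_mode example_pick_mmap_mode(unsigned int placements)
   {
       /* smem-only: WB, userspace does not need to call clflush. */
       if (placements == EXAMPLE_PLACE_SMEM)
           return EXAMPLE_MMAP_WB;

       /* smem+lmem and lmem-only: always WC. */
       return EXAMPLE_MMAP_WC;
   }

Any mmap request asking for a mode other than the one returned here would be
rejected under this scheme.
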
Links
=====
[1] https://patchwork.freedesktop.org/series/86798/

[2] https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599#note_553791