IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
vms are not getting properly closed. Rather than fixing that,
Remove the vm open count and instead rely on the vm refcount.
The vm open count existed solely to break the strong references the
vmas had on the vms. Now instead make those references weak and
ensure vmas are destroyed when the vm is destroyed.
Unfortunately if the vm destructor and the object destructor both
wants to destroy a vma, that may lead to a race in that the vm
destructor just unbinds the vma and leaves the actual vma destruction
to the object destructor. However in order for the object destructor
to ensure the vma is unbound it needs to grab the vm mutex. In order
to keep the vm mutex alive until the object destructor is done with
it, somewhat hackishly grab a vm_resv refcount that is released late
in the vma destruction process, when the vm mutex is no longer needed.
v2: Address review-comments from Niranjana
- Clarify that the struct i915_address_space::skip_pte_rewrite is a hack
and should ideally be replaced in an upcoming patch.
- Remove an unneeded continue in clear_vm_list and update comment.
v3:
- Documentation update
- Commit message formatting
Co-developed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Niranjana Vishwanathapura <niranjana.vishwanathapura@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220304082641.308069-2-thomas.hellstrom@linux.intel.com
The current implementation of i915 prime mmap only works when initializing
drm_i915_gem_object with shmem_region.
When using LMEM, drm_i915_gem_object is initialized with ttm_system_region.
In order to make prime mmap work even this case, when using LMEM
(when using ttm in i915), dma_buf_ops.mmap callback function calls
drm_gem_prime_mmap(). drm_gem_prime_mmap() of drm core calls internally
i915_gem_mmap() so that prime mmap can perform normally.
The fake offset is processed inside drm_gem_prime_mmap().
Testcase: igt/prime_mmap
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225131316.1433515-3-gwan-gyeong.mun@intel.com
The dma_buf_ops.unmap_dma_buf callback used in i915,
i915_gem_unmap_dma_buf(), has the same code as drm_gem_unmap_dma_buf().
In order to eliminate defining and using duplicate function, it updates
the dma_buf_ops.unmap_dma_buf callback to use drm_gem_unmap_dma_buf().
Signed-off-by: Gwan-gyeong Mun <gwan-gyeong.mun@intel.com>
Reviewed-by: Nirmoy Das <nirmoy.das@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225131316.1433515-2-gwan-gyeong.mun@intel.com
A different emit breadcrumbs ring programming is required for compute /
render and we don't have UMD user so just reject parallel submission for
these engine classes.
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-11-matthew.d.roper@intel.com
The compute engine handles the same commands the render engine can
(except 3D pipeline), so it makes sense that CCS is more similar to RCS
than non-render engines.
The CCS context state (lrc) is also similar to the render one, so reuse
it. Note that the compute engine has its own CTX_R_PWR_CLK_STATE
register.
In order to avoid having multiple RCS && CCS checks, add the following
engine flag:
- I915_ENGINE_HAS_RCS_REG_STATE - use the render (larger) reg state ctx.
BSpec: 46260
Original-author: Michel Thierry
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Signed-off-by: Aravind Iddamsetty <aravind.iddamsetty@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220301231549.1817978-6-matthew.d.roper@intel.com
Exercise each of the migration scenarios, verifying that the final
placement and buffer contents match our expectations.
v2(Thomas): Replace for_i915_gem_ww() block with simpler object_lock()
v3:
- For testing purposes allow forcing the io_size such that we can
exercise the allocation + migration path on devices that don't have the
small BAR limit.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-4-matthew.auld@intel.com
If we have to contend with non-mappable LMEM, then we need to ensure the
object fits within the mappable portion, like in the selftests, where we
later try to CPU access the pages. However if it can't then we need to
gracefully handle this, without throwing an error.
Also it looks like TTM will return -ENOMEM, in ttm_bo_mem_space() after
exhausting all possible placements.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-3-matthew.auld@intel.com
The end goal is to have userspace tell the kernel what buffers will
require CPU access, however if we ever reach the CPU fault handler, and
the current resource is not mappable, then we should attempt to migrate
the buffer to the mappable portion of LMEM, or even system memory, if the
allowable placements permit it.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-2-matthew.auld@intel.com
If we need to make room for some mappable object, then we should
only victimize objects that have one or pages that occupy the visible
portion of LMEM. Let's also create a new priority hint for objects that
are placed in mappable memory, where we know that CPU access was
requested, that way we hopefully victimize these last.
v2(Thomas): s/TTM_PL_PRIV/I915_PL_LMEM0/
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220228123607.580432-1-matthew.auld@intel.com
It's unclear what reference the initial vma kref reference refers to.
A vma can have multiple weak references, the object vma list,
the vm's bound list and the GT's closed_list, and the initial vma
reference can be put from lookups of all these lists.
With the current implementation this means
that any holder of yet another vma refcount (currently only
i915_gem_object_unbind()) needs to be holding two of either
*) An object refcount,
*) A vm open count
*) A vma open count
in order for us to not risk leaking a reference by having the
initial vma reference being put twice.
Address this by re-introducing i915_vma_destroy() which removes all
weak references of the vma and *then* puts the initial vma refcount.
This makes a strong vma reference hold on to the vma unconditionally.
Perhaps a better name would be i915_vma_revoke() or i915_vma_zombify(),
since other callers may still hold a refcount, but with the prospect of
being able to replace the vma refcount with the object lock in the near
future, let's stick with i915_vma_destroy().
Finally this commit fixes a race in that previously i915_vma_release() and
now i915_vma_destroy() could destroy a vma without taking the vm->mutex
after an advisory check that the vma mm_node was not allocated.
This would race with the ungrab_vma() function creating a trace similar
to the below one. This was fixed in one of the __i915_vma_put() callsites
in
commit bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding")
but although not seemingly triggered by CI, that
is not sufficient. This patch is needed to fix that properly.
[823.012188] Console: switching to colour dummy device 80x25
[823.012422] [IGT] gem_ppgtt: executing
[823.016667] [IGT] gem_ppgtt: starting subtest blt-vs-render-ctx0
[852.436465] stack segment: 0000 [#1] PREEMPT SMP NOPTI
[852.436480] CPU: 0 PID: 3200 Comm: gem_ppgtt Not tainted 5.16.0-CI-CI_DRM_11115+ #1
[852.436489] Hardware name: Intel Corporation Alder Lake Client Platform/AlderLake-P DDR5 RVP, BIOS ADLPFWI1.R00.2422.A00.2110131104 10/13/2021
[852.436499] RIP: 0010:ungrab_vma+0x9/0x80 [i915]
[852.436711] Code: ef e8 4b 85 cf e0 e8 36 a3 d6 e0 8b 83 f8 9c 00 00 85 c0 75 e1 5b 5d 41 5c 41 5d c3 e9 d6 fd 14 00 55 53 48 8b af c0 00 00 00 <8b> 45 00 85 c0 75 03 5b 5d c3 48 8b 85 a0 02 00 00 48 89 fb 48 8b
[852.436727] RSP: 0018:ffffc90006db7880 EFLAGS: 00010246
[852.436734] RAX: 0000000000000000 RBX: ffffc90006db7598 RCX: 0000000000000000
[852.436742] RDX: ffff88815349e898 RSI: ffff88815349e858 RDI: ffff88810a284140
[852.436748] RBP: 6b6b6b6b6b6b6b6b R08: ffff88815349e898 R09: ffff88815349e8e8
[852.436754] R10: 0000000000000001 R11: 0000000051ef1141 R12: ffff88810a284140
[852.436762] R13: 0000000000000000 R14: ffff88815349e868 R15: ffff88810a284458
[852.436770] FS: 00007f5c04b04e40(0000) GS:ffff88849f000000(0000) knlGS:0000000000000000
[852.436781] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[852.436788] CR2: 00007f5c04b38fe0 CR3: 000000010a6e8001 CR4: 0000000000770ef0
[852.436797] PKRU: 55555554
[852.436801] Call Trace:
[852.436806] <TASK>
[852.436811] i915_gem_evict_for_node+0x33c/0x3c0 [i915]
[852.437014] i915_gem_gtt_reserve+0x106/0x130 [i915]
[852.437211] i915_vma_pin_ww+0x8f4/0xb60 [i915]
[852.437412] eb_validate_vmas+0x688/0x860 [i915]
[852.437596] i915_gem_do_execbuffer+0xc0e/0x25b0 [i915]
[852.437770] ? deactivate_slab+0x5f2/0x7d0
[852.437778] ? _raw_spin_unlock_irqrestore+0x50/0x60
[852.437789] ? i915_gem_execbuffer2_ioctl+0xc6/0x2c0 [i915]
[852.437944] ? init_object+0x49/0x80
[852.437950] ? __lock_acquire+0x5e6/0x2580
[852.437963] i915_gem_execbuffer2_ioctl+0x116/0x2c0 [i915]
[852.438129] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438300] drm_ioctl_kernel+0xac/0x140
[852.438310] drm_ioctl+0x201/0x3d0
[852.438316] ? i915_gem_do_execbuffer+0x25b0/0x25b0 [i915]
[852.438490] __x64_sys_ioctl+0x6a/0xa0
[852.438498] do_syscall_64+0x37/0xb0
[852.438507] entry_SYSCALL_64_after_hwframe+0x44/0xae
[852.438515] RIP: 0033:0x7f5c0415b317
[852.438523] Code: b3 66 90 48 8b 05 71 4b 2d 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 41 4b 2d 00 f7 d8 64 89 01 48
[852.438542] RSP: 002b:00007ffd765039a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[852.438553] RAX: ffffffffffffffda RBX: 000055e4d7829dd0 RCX: 00007f5c0415b317
[852.438562] RDX: 00007ffd76503a00 RSI: 00000000c0406469 RDI: 0000000000000017
[852.438571] RBP: 00007ffd76503a00 R08: 0000000000000000 R09: 0000000000000081
[852.438579] R10: 00000000ffffff7f R11: 0000000000000246 R12: 00000000c0406469
[852.438587] R13: 0000000000000017 R14: 00007ffd76503a00 R15: 0000000000000000
[852.438598] </TASK>
[852.438602] Modules linked in: snd_hda_codec_hdmi i915 mei_hdcp x86_pkg_temp_thermal snd_hda_intel snd_intel_dspcfg drm_buddy coretemp crct10dif_pclmul crc32_pclmul snd_hda_codec ttm ghash_clmulni_intel snd_hwdep snd_hda_core e1000e drm_dp_helper ptp snd_pcm mei_me drm_kms_helper pps_core mei syscopyarea sysfillrect sysimgblt fb_sys_fops prime_numbers intel_lpss_pci smsc75xx usbnet mii
[852.440310] ---[ end trace e52cdd2fe4fd911c ]---
v2: Fix typos in the commit message.
Fixes: 7e00897be8bf ("drm/i915: Add object locking to i915_gem_evict_for_node and i915_gem_evict_something, v2.")
Fixes: bc1922e5d349 ("drm/i915: Fix a race between vma / object destruction and unbinding")
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222133209.587978-1-thomas.hellstrom@linux.intel.com
If the user doesn't require CPU access for the buffer, then
ALLOC_GPU_ONLY should be used, in order to prioritise allocating in the
non-mappable portion of LMEM, on devices with small BAR.
v2(Thomas):
- The BO_ALLOC_TOPDOWN naming here is poor, since this is pure lies on
systems that don't even have small BAR. A better name is GPU_ONLY,
which is accurate regardless of the configuration.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-3-matthew.auld@intel.com
On devices with non-mappable LMEM ensure we always allocate the pages
within the mappable portion. For now we assume that all LMEM buffers
will require CPU access, which is also inline with pretty much all
current kernel internal users. In the next patch we will introduce a new
flag to override this behaviour.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-2-matthew.auld@intel.com
With small LMEM-BAR we need to be able to differentiate between the
total size of LMEM, and how much of it is CPU mappable. The end goal is
to be able to utilize the entire range, even if part of is it not CPU
accessible.
v2: also update intelfb_create
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Acked-by: Nirmoy Das <nirmoy.das@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220225145502.331818-1-matthew.auld@intel.com
Add check for zero usable stolen memory before calling drm_mm_init
to support configurations where stolen memory exists but is fully
reserved.
Also skip memory test in cases that usable stolen is smaller than
page size(amount mapped and used to test memory).
v2:
- skiping test if available memory is smaller than page size (Lucas)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Cc: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Signed-off-by: Steve Carbonari <steven.carbonari@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220223194946.725328-1-jose.souza@intel.com
UAPI Changes:
- Weak parallel submission support for execlists
Minimal implementation of the parallel submission support for
execlists backend that was previously only implemented for GuC.
Support one sibling non-virtual engine.
Core Changes:
- Two backmerges of drm/drm-next for header file renames/changes and
i915_regs reorganization
Driver Changes:
- Add new DG2 subplatform: DG2-G12 (Matt R)
- Add new DG2 workarounds (Matt R, Ram, Bruce)
- Handle pre-programmed WOPCM registers for DG2+ (Daniele)
- Update guc shim control programming on XeHP SDV+ (Daniele)
- Add RPL-S C0/D0 stepping information (Anusha)
- Improve GuC ADS initialization to work on ARM64 on dGFX (Lucas)
- Fix KMD and GuC race on accessing PMU busyness (Umesh)
- Use PM timestamp instead of RING TIMESTAMP for reference in PMU with GuC (Umesh)
- Report error on invalid reset notification from GuC (John)
- Avoid WARN splat by holding RPM wakelock during PXP unbind (Juston)
- Fixes to parallel submission implementation (Matt B.)
- Improve GuC loading status check/error reports (John)
- Tweak TTM LRU priority hint selection (Matt A.)
- Align the plane_vma to min_page_size of stolen mem (Ram)
- Introduce vma resources and implement async unbinding (Thomas)
- Use struct vma_resource instead of struct vma_snapshot (Thomas)
- Return some TTM accel move errors instead of trying memcpy move (Thomas)
- Fix a race between vma / object destruction and unbinding (Thomas)
- Remove short-term pins from execbuf (Maarten)
- Update to GuC version 69.0.3 (John, Michal Wa.)
- Improvements to GT reset paths in GuC backend (Matt B.)
- Use shrinker_release_pages instead of writeback in shmem object hooks (Matt A., Tvrtko)
- Use trylock instead of blocking lock when freeing GEM objects (Maarten)
- Allocate intel_engine_coredump_alloc with ALLOW_FAIL (Matt B.)
- Fixes to object unmapping and purging (Matt A)
- Check for wedged device in GuC backend (John)
- Avoid lockdep splat by locking dpt_obj around set_cache_level (Maarten)
- Allow dead vm to unbind vma's without lock (Maarten)
- s/engine->i915/i915/ for DG2 engine workarounds (Matt R)
- Use to_gt() helper for GGTT accesses (Michal Wi.)
- Selftest improvements (Matt B., Thomas, Ram)
- Coding style and compiler warning fixes (Matt B., Jasmine, Andi, Colin, Gustavo, Dan)
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/Yg4i2aCZvvee5Eai@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
[Fixed conflicts while applying, using the fixups/drm-intel-gt-next.patch
from drm-rerere's 1f2b1742abdd ("2022y-02m-23d-16h-07m-57s UTC: drm-tip
rerere cache update")]
discrete cards optimise 64K GTT pages for local-memory, since everything
should be allocated at 64K granularity. We say goodbye to sparse
entries, and instead get a compact 256B page-table for 64K pages,
which should be more cache friendly. 4K pages for local-memory
are no longer supported by the HW.
v4: don't return uninitialized err in igt_ppgtt_compact
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Stuart Summers <stuart.summers@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-8-ramalingam.c@intel.com
For local-memory objects we need to align the GTT addresses
to 64K, both for the ppgtt and ggtt.
We need to support vm->min_alignment > 4K, depending
on the vm itself and the type of object we are inserting.
With this in mind update the GTT selftests to take this
into account.
For compact-pt we further align and pad lmem object GTT addresses
to 2MB to ensure PDEs contain consistent page sizes as
required by the HW.
v3:
* use needs_compact_pt flag to discriminate between
64K and 64K with compact-pt
* add i915_vm_obj_min_alignment
* use i915_vm_obj_min_alignment to round up vma reservation
if compact-pt instead of hard coding
v5:
* fix i915_vm_obj_min_alignment for internal objects which
have no memory region
v6:
* tiled_blits_create correctly pick largest required alignment
v8:
* i915_vm_min_alignment protect against array overflow for mock region
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Ramalingam C <ramalingam.c@intel.com>
Signed-off-by: Robert Beckett <bob.beckett@collabora.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220218184752.7524-7-ramalingam.c@intel.com
Registers that exist within the MCH BAR and are mirrored into the GPU's
MMIO space are a good candidate to separate out into their own header.
For reference, the mirror of the MCH BAR starts at the following
locations in the graphics MMIO space (the end of the MCHBAR range
differs slightly on each platform):
* Pre-gen6: 0x10000
* Gen6-Gen11 + RKL: 0x140000
v2:
- Create separate patch to swtich a few register definitions to be
relative to the MCHBAR mirror base.
- Drop upper bound of MCHBAR mirror from commit message; there are too
many different combinations between various platforms to list out,
and the documentation is spotty for the older pre-gen6 platforms
anyway.
Bspec: 134, 51771
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Suggested-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220215061342.2055952-2-matthew.d.roper@intel.com
Include drm_fourcc.h, drm_plane.h, and drm_color_mgmt.h where needed, so
we can drop the includes for drm_atomic.h and drm_fourcc.h from
i915_drv.h, reducing the build dependencies.
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/b03711b2286396b2e9d5822f6adef4e7a6dc0f7b.1644507885.git.jani.nikula@intel.com
For some reason we are selecting PRIO_HAS_PAGES when we don't have
mm.pages, and vice versa.
v2(Thomas):
- Add missing fixes tag
Fixes: 213d50927763 ("drm/i915/ttm: Introduce a TTM i915 gem object backend")
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220209111652.468762-1-matthew.auld@intel.com
Rename struct dma_buf_map to struct iosys_map and corresponding APIs.
Over time dma-buf-map grew up to more functionality than the one used by
dma-buf: in fact it's just a shim layer to abstract system memory, that
can be accessed via regular load and store, from IO memory that needs to
be acessed via arch helpers.
The idea is to extend this API so it can fulfill other needs, internal
to a single driver. Example: in the i915 driver it's desired to share
the implementation for integrated graphics, which uses mostly system
memory, with discrete graphics, which may need to access IO memory.
The conversion was mostly done with the following semantic patch:
@r1@
@@
- struct dma_buf_map
+ struct iosys_map
@r2@
@@
(
- DMA_BUF_MAP_INIT_VADDR
+ IOSYS_MAP_INIT_VADDR
|
- dma_buf_map_set_vaddr
+ iosys_map_set_vaddr
|
- dma_buf_map_set_vaddr_iomem
+ iosys_map_set_vaddr_iomem
|
- dma_buf_map_is_equal
+ iosys_map_is_equal
|
- dma_buf_map_is_null
+ iosys_map_is_null
|
- dma_buf_map_is_set
+ iosys_map_is_set
|
- dma_buf_map_clear
+ iosys_map_clear
|
- dma_buf_map_memcpy_to
+ iosys_map_memcpy_to
|
- dma_buf_map_incr
+ iosys_map_incr
)
@@
@@
- #include <linux/dma-buf-map.h>
+ #include <linux/iosys-map.h>
Then some files had their includes adjusted and some comments were
update to remove mentions to dma-buf-map.
Since this is not specific to dma-buf anymore, move the documentation to
the "Bus-Independent Device Accesses" section.
v2:
- Squash patches
v3:
- Fix wrong removal of dma-buf.h from MAINTAINERS
- Move documentation from dma-buf.rst to device-io.rst
v4:
- Change documentation title and level
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Sumit Semwal <sumit.semwal@linaro.org>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20220204170541.829227-1-lucas.demarchi@intel.com
Backmerge to bring in 5.17-rc2 to introduce a common baseline
to merge i915_regs changes from drm-intel-next.
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Several of our i915 header files, have been including i915_reg.h. This
means that any change to i915_reg.h will trigger a full rebuild of
pretty much every file of the driver, even those that don't have any
kind of register access. Let's delete the i915_reg.h include from all
headers and add an explicit include from the .c files that truly
need the register definitions; those that need a definition of
i915_reg_t for a function definition can get it from i915_reg_defs.h
instead.
We also remove two non-register #define's (VLV_DISPLAY_BASE and
GEN12_SFC_DONE_MAX) into i915_reg_defs.h to allow us to drop the
i915_reg.h include from a couple of headers.
There's probably a lot more header dependency optimization possible, but
the changes here roughly cut the number of files compiled after 'touch
i915_reg.h' in half --- a good first step.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-7-matthew.d.roper@intel.com
This is a huge, chaotic mass of registers copied over as-is without any
real cleanup. We'll come back and organize these better, align on
consistent coding style, remove dead code, etc. in separate patches
later that will be easier to review.
v2:
- Add missing include in intel_pxp_irq.c
v3:
- Correct a few indentation errors (Lucas)
- Minor conflict resolution
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
At the moment we only use R_PWR_CLK_STATE in the context of the RCS
engine, but upcoming support for compute engines will start using
instances relative to the CCS engine base offsets. Let's parameterize
the register and move it to the engine reg header.
Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-4-matthew.d.roper@intel.com
The i915_ttm_accel_move() function may return error codes that should
be propagated further up the stack rather than consumed assuming that
the accel move failed and could be replaced with a memcpy move.
For -EINTR, -ERESTARTSYS and -EAGAIN, just propagate those codes, rather
than retrying with a memcpy move.
Fixes: 2b0a750caf33 ("drm/i915/ttm: Failsafe migration blits")
Cc: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220201070340.16457-1-thomas.hellstrom@linux.intel.com
Catch-up with 5.17-rc2 and trying to align with drm-intel-gt-next
for a possible topic branch for merging the split of i915_regs...
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
The vma destruction code was using an unlocked advisory check for
drm_mm_node_allocated() to avoid racing with eviction code unbinding
the vma.
This is very fragile and prohibits the dereference of non-refcounted
pointers of dying vmas after a call to __i915_vma_unbind(). It also
prohibits the dereference of vma->obj of refcounted pointers of
dying vmas after a call to __i915_vma_unbind(), since even if a
refcount is held on the vma, that won't guarantee that its backing
object doesn't get destroyed.
So introduce an unbind under the vm mutex at object destroy time,
removing all weak references of the vma and its object from the
object vma list and from the vm bound list.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127115622.302970-1-thomas.hellstrom@linux.intel.com
We need to flush TLBs before releasing backing store otherwise userspace
is able to encounter stale entries if a) it is not declaring access to
certain buffers and b) it races with the backing store release from a
such undeclared execution already executing on the GPU in parallel.
The approach taken is to mark any buffer objects which were ever bound
to the GPU and to trigger a serialized TLB flush when their backing
store is released.
Alternatively the flushing could be done on VMA unbind, at which point
we would be able to ascertain whether there is potential a parallel GPU
execution (which could race), but essentially it boils down to paying
the cost of TLB flushes potentially needlessly at VMA unbind time (when
the backing store is not known to be going away so not needed for
safety), versus potentially needlessly at backing store relase time
(since we at that point cannot tell whether there is anything executing
on the GPU which uses that object).
Thereforce simplicity of implementation has been chosen for now with
scope to benchmark and refine later as required.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Sushma Venkatesh Reddy <sushma.venkatesh.reddy@intel.com>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Acked-by: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Jon Bloomfield <jon.bloomfield@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jani Nikula <jani.nikula@intel.com>
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Don't use the interruptable version of the timeline mutex lock in the
error path of eb_pin_timeline as the cleanup must always happen.
v2:
(John Harrison)
- Don't check for interrupt during mutex lock
v3:
(Tvrtko)
- A comment explaining why lock helper isn't used
Fixes: 544460c33821 ("drm/i915: Multi-BB execbuf")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: John Harrison <John.C.Harrison@Intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111163929.14017-1-matthew.brost@intel.com
Add a flag PIN_VALIDATE, to indicate we don't need to pin and only
protected by the object lock.
This removes the need to unpin, which is done by just releasing the
lock.
eb_reserve is slightly reworked for readability, but the same steps
are still done:
- First pass pins with NONBLOCK.
- Second pass unbinds all objects first, then pins.
- Third pass is only called when not all objects are softpinned, and
unbinds all objects, then calls i915_gem_evict_vm(), then pins.
Changes since v1:
- Split out eb_reserve() into separate functions for readability.
Changes since v2:
- Make batch buffer mappable on platforms where only GGTT is available,
to prevent moving the batch buffer during relocations.
Changes since v3:
- Preserve current behavior for batch buffer, instead be cautious when
calling i915_gem_object_ggtt_pin_ww, and re-use the current batch vma
if it's inside ggtt and map-and-fenceable.
- Remove impossible condition check from eb_reserve. (Matt)
Changes since v5:
- Do not even temporarily pin, just call i915_gem_evict_vm() and mark
all vma's as unpinned.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-7-maarten.lankhorst@linux.intel.com
We want to remove more members of i915_vma, which requires the locking to
be held more often.
Start requiring gem object lock for i915_vma_unbind, as it's one of the
callers that may unpin pages.
Some special care is needed when evicting, because the last reference to
the object may be held by the VMA, so after __i915_vma_unbind, vma may be
garbage, and we need to cache vma->obj before unlocking.
Changes since v1:
- Make trylock failing a WARN. (Matt)
- Remove double i915_vma_wait_for_bind() (Matt)
- Move atomic_set to right before mutex_unlock(), to make it more clear
they belong together. (Matt)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.william.auld@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-5-maarten.lankhorst@linux.intel.com
i915_gem_evict_vm will need to be able to evict objects that are
locked by the current ctx. By testing if the current context already
locked the object, we can do this correctly. This allows us to
evict the entire vm even if we already hold some objects' locks.
Previously, this was spread over several commits, but it makes
more sense to commit the changes to i915_gem_evict_vm separately
from the changes to i915_gem_evict_something() and
i915_gem_evict_for_node().
Changes since v1:
- Handle evicting dead objects better.
Changes since v2:
- Use for_i915_gem_ww in igt_evict_vm. (Thomas)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
[mlankhorst: Fix up doc warning.]
Link: https://patchwork.freedesktop.org/patch/msgid/20220117075604.131477-1-maarten.lankhorst@linux.intel.com
Now that we cannot unbind kill the currently locked object directly
because we're removing short term pinning, we may have to unbind the
object from gtt manually, using a i915_gem_evict_vm() call.
Changes since v1:
- Remove -ENOSPC warning, can still happen with concurrent mmaps
where we can't unbind the other mmap because of the lock held.
This fixes the gem_mmap_gtt@cpuset tests.
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220114132320.109030-2-maarten.lankhorst@linux.intel.com
Maarten needs backmerge to account for header file renames/changes which
landed via drm-intel-next and are interfering with his pinning work.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
drivers fixes:
- i915 fixes for ttm backend + one pm wakelock fix
- amdgpu fixes, fairly big pile of small things all over. Note this
doesn't yet containe the fixed version of the otg sync patch that
blew up
- small driver fixes: meson, sun4i, vga16fb probe fix
drm core fixes:
- cma-buf heap locking
- ttm compilation
- self refresh helper state check
- wrong error message in atomic helpers
- mipi-dbi buffer mapping
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEb4nG6jLu8Y5XI+PfTA9ye/CYqnEFAmHhrxUACgkQTA9ye/CY
qnEowxAAgBPwGEobRGMbR3Me98vEKvcWqSxBe/k1VC4LhO5DrvbG5iW9cuxCJZM2
wlGlGAtU7C7pcCP5Xp1UlMqZ5a0rSVhqMPPkMKO9+7033ofSlAQatnMI1EENH6Hn
BkhXwTyuOBSN6zqskg8FKqzF+VPTt5ZV2U5qJzQweP/wFtZPAKI4tWE4oKiHactH
fJHnAi7T6ytF6a7J21BsSEluk4z7BjmcmFF0tW6iuq7Y6TXDFXFq9QFDR041b2rI
GYDUXl2mebp/L+2M3sPYuMiIiyJ8enh7crNIdmi+EstmzRADa7RMjnY3j2tJg/7M
pqnuJZAVcpkCurb7NMr3ycmrxnhfUsZfbuXvm+k5yJYfQCaGNiKy1ObsFWH9zBDz
XMuxcE+csSaX/7rjoyXrL2ZTRPXnVwJNJ8x1CuKn3giLxMSqnPnDMjyHmNLB8qa1
R0wbPQbdx5+jWgs/ngUGFNo4vFBnNmqQP4G3LaWJ/Ku5cSrEM+Jt9GJOw5jjVgun
gaOlKlUpMBlKmXOwkvhRW2AwHRcL7lrBuIw0oFOThdMzkSNlKSNBMpAkgvD/9C7g
IDtJgA7a3MqzLjhPLmhUB3rFyXz5dg5rNZhoH8z4DFiJVpTKYYA5/UoU/RTXZGqW
yFdrBkFmNJeKTZZAe+e/4OP/dm1vKVEKK6Ko/M9CrELzBKSussg=
=h74X
-----END PGP SIGNATURE-----
Merge tag 'drm-next-2022-01-14' of git://anongit.freedesktop.org/drm/drm
Pull drm fixes from Daniel Vetter:
"drivers fixes:
- i915 fixes for ttm backend + one pm wakelock fix
- amdgpu fixes, fairly big pile of small things all over. Note this
doesn't yet containe the fixed version of the otg sync patch that
blew up
- small driver fixes: meson, sun4i, vga16fb probe fix
drm core fixes:
- cma-buf heap locking
- ttm compilation
- self refresh helper state check
- wrong error message in atomic helpers
- mipi-dbi buffer mapping"
* tag 'drm-next-2022-01-14' of git://anongit.freedesktop.org/drm/drm: (49 commits)
drm/mipi-dbi: Fix source-buffer address in mipi_dbi_buf_copy
drm: fix error found in some cases after the patch d1af5cd86997
drm/ttm: fix compilation on ARCH=um
dma-buf: cma_heap: Fix mutex locking section
video: vga16fb: Only probe for EGA and VGA 16 color graphic cards
drm/amdkfd: Fix ASIC name typos
drm/amdkfd: Fix DQM asserts on Hawaii
drm/amdgpu: Use correct VIEWPORT_DIMENSION for DCN2
drm/amd/pm: only send GmiPwrDnControl msg on master die (v3)
drm/amdgpu: use spin_lock_irqsave to avoid deadlock by local interrupt
drm/amdgpu: not return error on the init_apu_flags
drm/amdkfd: Use prange->update_list head for remove_list
drm/amdkfd: Use prange->list head for insert_list
drm/amdkfd: make SPDX License expression more sound
drm/amdkfd: Check for null pointer after calling kmemdup
drm/amd/display: invalid parameter check in dmub_hpd_callback
Revert "drm/amdgpu: Don't inherit GEM object VMAs in child process"
drm/amd/display: reset dcn31 SMU mailbox on failures
drm/amdkfd: use default_groups in kobj_type
drm/amdgpu: use default_groups in kobj_type
...
There is always a struct vma_resource guaranteed to be alive when we
access a corresponding struct vma_snapshot.
So ditch the latter and instead of allocating vma_snapshots, reference
the already existning vma_resource.
This requires a couple of extra members in struct vma_resource but that's
a small price to pay for the simplification.
v2:
- Fix a missing include and declaration (kernel test robot <lkp@intel.com>)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-7-thomas.hellstrom@linux.intel.com
Add a selftest to exercise asynchronous migration and -unbining.
Extend the gem_migrate selftest to perform the migrations while
depending on a spinner and a bound vma set up on the migrated
buffer object.
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-6-thomas.hellstrom@linux.intel.com
Implement async (non-blocking) unbinding by not syncing the vma before
calling unbind on the vma_resource.
Add the resulting unbind fence to the object's dma_resv from where it is
picked up by the ttm migration code.
Ideally these unbind fences should be coalesced with the migration blit
fence to avoid stalling the migration blit waiting for unbind, as they
can certainly go on in parallel, but since we don't yet have a
reasonable data structure to use to coalesce fences and attach the
resulting fence to a timeline, we defer that for now.
Note that with async unbinding, even while the unbind waits for the
preceding bind to complete before unbinding, the vma itself might have been
destroyed in the process, clearing the vma pages. Therefore we can
only allow async unbinding if we have a refcounted sg-list and keep a
refcount on that for the vma resource pages to stay intact until
binding occurs. If this condition is not met, a request for an async
unbind is diverted to a sync unbind.
v2:
- Use a separate kmem_cache for vma resources for now to isolate their
memory allocation and aid debugging.
- Move the check for vm closed to the actual unbinding thread. Regardless
of whether the vm is closed, we need the unbind fence to properly wait
for capture.
- Clear vma_res::vm on unbind and update its documentation.
v4:
- Take cache coloring into account when searching for vma resources
pending unbind. (Matthew Auld)
v5:
- Fix timeout and error check in i915_vma_resource_bind_dep_await().
- Avoid taking a reference on the object for async binding if
async unbind capable.
- Fix braces around a single-line if statement.
v6:
- Fix up the cache coloring adjustment. (Kernel test robot <lkp@intel.com>)
- Don't allow async unbinding if the vma_res pages are not the same as
the object pages. (Matthew Auld)
v7:
- s/unsigned long/u64/ in a number of places (Matthew Auld)
Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220110172219.107131-5-thomas.hellstrom@linux.intel.com