2014-07-24 20:04:10 +04:00
/*
 * Copyright © 2014 Intel Corporation
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */
#ifndef _INTEL_LRC_H_
#define _INTEL_LRC_H_
drm/i915/bdw: Pin the context backing objects to GGTT on-demand
Up until now, we have pinned every logical ring context backing object
during creation, and left it pinned until destruction. This made my life
easier, but it's a harmful thing to do, because we cause fragmentation
of the GGTT (and, eventually, we would run out of space).
This patch makes the pinning on-demand: the backing objects of the two
contexts that are written to the ELSP are pinned right before submission
and unpinned once the hardware is done with them. The only context that
is still pinned regardless is the global default one, so that the HWS can
still be accessed in the same way (ring->status_page).
v2: In the early version of this patch, we were pinning the context as
we put it into the ELSP: on the one hand, this is very efficient because
only a maximum of two contexts are pinned at any given time, but on the other
hand, we cannot really pin in interrupt time :(
v3: Use a mutex rather than atomic_t to protect pin count to avoid races.
Do not unpin default context in free_request.
v4: Break out pin and unpin into functions. Fix style problems reported
by checkpatch
v5: Remove unpin_lock as all pinning and unpinning is done with the struct
mutex already locked. Add WARN_ONs to make sure this is the case in future.
Issue: VIZ-4277
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
Reviewed-by: Akash Goel <akash.goels@gmail.com>
Reviewed-by: Deepak S <deepak.s@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-11-13 13:28:10 +03:00
#define GEN8_LR_CONTEXT_ALIGN 4096
2014-08-07 16:23:20 +04:00
/* Execlists regs */
#define RING_ELSP(ring)			((ring)->mmio_base+0x230)
#define RING_EXECLIST_STATUS(ring)	((ring)->mmio_base+0x234)
#define RING_CONTEXT_CONTROL(ring)	((ring)->mmio_base+0x244)
#define RING_CONTEXT_STATUS_BUF(ring)	((ring)->mmio_base+0x370)
#define RING_CONTEXT_STATUS_PTR(ring)	((ring)->mmio_base+0x3a0)
2014-07-24 20:04:22 +04:00
/* Logical Rings */
void intel_logical_ring_stop(struct intel_engine_cs *ring);
void intel_logical_ring_cleanup(struct intel_engine_cs *ring);
int intel_logical_rings_init(struct drm_device *dev);
2014-07-24 20:04:29 +04:00
int logical_ring_flush_all_caches(struct intel_ringbuffer *ringbuf);
2014-07-24 20:04:26 +04:00
void intel_logical_ring_advance_and_submit(struct intel_ringbuffer *ringbuf);
2014-07-24 20:04:48 +04:00
/**
 * intel_logical_ring_advance() - advance the ringbuffer tail
 * @ringbuf: Ringbuffer to advance.
 *
 * The tail is only updated in our logical ringbuffer struct.
 */
2014-07-24 20:04:26 +04:00
static inline void intel_logical_ring_advance(struct intel_ringbuffer *ringbuf)
{
	ringbuf->tail &= ringbuf->size - 1;
}
2014-07-24 20:04:48 +04:00
/**
 * intel_logical_ring_emit() - write a DWORD to the ringbuffer.
 * @ringbuf: Ringbuffer to write to.
 * @data: DWORD to write.
 */
2014-07-24 20:04:26 +04:00
static inline void intel_logical_ring_emit(struct intel_ringbuffer *ringbuf,
					   u32 data)
{
	iowrite32(data, ringbuf->virtual_start + ringbuf->tail);
	ringbuf->tail += 4;
}
int intel_logical_ring_begin(struct intel_ringbuffer *ringbuf, int num_dwords);
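The emit/advance pair above can be exercised outside the kernel with a small model. This is only a sketch: struct demo_ringbuf, DEMO_RING_SIZE and the demo_* helpers are hypothetical names, a plain byte array stands in for the mapped ring memory, and memcpy() stands in for iowrite32(). It assumes, as the mask in intel_logical_ring_advance() implies, that the ring size is a power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct intel_ringbuffer. */
#define DEMO_RING_SIZE 16u	/* bytes; must be a power of two */

struct demo_ringbuf {
	uint8_t  storage[DEMO_RING_SIZE];
	uint32_t size;	/* ring size in bytes */
	uint32_t tail;	/* byte offset of the next write */
};

static void demo_ring_init(struct demo_ringbuf *rb)
{
	memset(rb, 0, sizeof(*rb));
	rb->size = DEMO_RING_SIZE;
}

/* Mirrors intel_logical_ring_emit(): store one DWORD, bump tail by 4. */
static void demo_ring_emit(struct demo_ringbuf *rb, uint32_t data)
{
	/* The caller must have reserved space; the model just checks it. */
	assert(rb->tail + 4 <= rb->size);
	memcpy(rb->storage + rb->tail, &data, sizeof(data));
	rb->tail += 4;
}

/* Mirrors intel_logical_ring_advance(): wrap tail with a size - 1 mask. */
static void demo_ring_advance(struct demo_ringbuf *rb)
{
	rb->tail &= rb->size - 1;
}
```

In this model a tail that reaches exactly the ring size wraps to 0 on advance, while a tail inside the ring is left unchanged. The mask only behaves like a modulo when the size is a power of two.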
2014-07-24 20:04:12 +04:00
/* Logical Ring Contexts */
2014-08-21 14:40:54 +04:00
int intel_lr_context_render_state_init(struct intel_engine_cs *ring,
				       struct intel_context *ctx);
2014-07-24 20:04:12 +04:00
void intel_lr_context_free(struct intel_context *ctx);
int intel_lr_context_deferred_create(struct intel_context *ctx,
				     struct intel_engine_cs *ring);
2014-11-13 13:28:10 +03:00
void intel_lr_context_unpin(struct intel_engine_cs *ring,
			    struct intel_context *ctx);
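The on-demand pinning that intel_lr_context_unpin() belongs to is described in the commit message as being driven by a pin count. A rough model of that idea, not the driver's implementation: struct demo_ctx_obj and the demo_* helpers are hypothetical, the in_ggtt flag stands in for the real GGTT binding, and locking is left out because the commit message states that all pinning and unpinning happens with the struct mutex already held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a logical ring context backing object. */
struct demo_ctx_obj {
	int  pin_count;	/* outstanding users, e.g. queued submissions */
	bool in_ggtt;	/* true while the object occupies GGTT space */
};

/* Take a reference; bind into the GGTT only on the 0 -> 1 transition. */
static void demo_ctx_pin(struct demo_ctx_obj *obj)
{
	if (obj->pin_count++ == 0)
		obj->in_ggtt = true;
}

/* Drop a reference; release the GGTT space on the 1 -> 0 transition. */
static void demo_ctx_unpin(struct demo_ctx_obj *obj)
{
	assert(obj->pin_count > 0);	/* unbalanced unpin is a caller bug */
	if (--obj->pin_count == 0)
		obj->in_ggtt = false;
}
```

With this shape, a context that is queued twice stays bound until the second unpin, and a context that is not queued takes up no GGTT space.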
2014-07-24 20:04:12 +04:00
2014-07-24 20:04:11 +04:00
/* Execlists */
int intel_sanitize_enable_execlists(struct drm_device *dev, int enable_execlists);
2014-07-24 20:04:22 +04:00
int intel_execlists_submission(struct drm_device *dev, struct drm_file *file,
			       struct intel_engine_cs *ring,
			       struct intel_context *ctx,
			       struct drm_i915_gem_execbuffer2 *args,
			       struct list_head *vmas,
			       struct drm_i915_gem_object *batch_obj,
			       u64 exec_start, u32 flags);
2014-07-24 20:04:36 +04:00
u32 intel_execlists_ctx_id(struct drm_i915_gem_object *ctx_obj);
2014-07-24 20:04:11 +04:00
2014-07-24 20:04:48 +04:00
/**
 * struct intel_ctx_submit_request - queued context submission request
 * @ctx: Context to submit to the ELSP.
 * @ring: Engine to submit it to.
 * @tail: how far in the context's ringbuffer this request goes to.
 * @execlist_link: link in the submission queue.
 * @work: workqueue for processing this request in a bottom half.
 * @elsp_submitted: no. of times this request has been sent to the ELSP.
 *
 * The ELSP only accepts two elements at a time, so we queue context/tail
 * pairs on a given queue (ring->execlist_queue) until the hardware is
 * available. The queue serves a double purpose: we also use it to keep track
 * of the up to 2 contexts currently in the hardware (usually one in execution
 * and the other queued up by the GPU): We only remove elements from the head
 * of the queue when the hardware informs us that an element has been
 * completed.
 *
 * All accesses to the queue are mediated by a spinlock (ring->execlist_lock).
 */
2014-07-24 20:04:38 +04:00
struct intel_ctx_submit_request {
	struct intel_context *ctx;
	struct intel_engine_cs *ring;
	u32 tail;
	struct list_head execlist_link;
drm/i915/bdw: Avoid non-lite-restore preemptions
In the current Execlists feeding mechanism, full preemption is not
supported yet: only lite-restores are allowed (that is: the GPU
simply samples a new tail pointer for the context currently in
execution).
But we have identified a scenario in which a full preemption occurs:
1) We submit two contexts for execution (A & B).
2) The GPU finishes with the first one (A), switches to the second one
(B) and informs us.
3) We submit B again (hoping to cause a lite restore) together with C,
but in the time we spend writing to the ELSP, the GPU finishes B.
4) The GPU starts executing B again (since we told it so).
5) We receive a B finished interrupt and, mistakenly, we submit C (again)
and D, causing a full preemption of B.
The race is avoided by keeping track of how many times a context has been
submitted to the hardware and by better discriminating the received context
switch interrupts: in the example, when we have submitted B twice, we won't
submit C and D as soon as we receive the notification that B is completed
because we were expecting to get a LITE_RESTORE and we didn't, so we know a
second completion will be received shortly.
Without this explicit checking, somehow, the batch buffer execution order
gets messed with. This can be verified with the IGT test I sent together with
the series. I don't know the exact mechanism by which the pre-emption messes
with the execution order but, since other people are working on the Scheduler
+ Preemption on Execlists, I didn't try to fix it. In this series, only Lite
Restores are supported (other kinds of preemption WARN).
v2: elsp_submitted belongs in the new intel_ctx_submit_request. Several
rebase changes.
v3: Clarify how the race is avoided, as requested by Daniel.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Align function parameters ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-24 20:04:40 +04:00
	int elsp_submitted;
2014-07-24 20:04:38 +04:00
};
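The queueing rule documented for struct intel_ctx_submit_request, together with the elsp_submitted counter, can be modelled in a few lines. This is a simplified sketch under stated assumptions, not the driver code: struct demo_request, struct demo_queue and the demo_* helpers are hypothetical, a fixed array stands in for the list on ring->execlist_queue, contexts are identified by a plain integer, and the spinlock and the actual ELSP write are left out. The model submits up to the first two queued requests and removes the head only when every submission of it has been matched by a completion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_QUEUE_MAX 8

/* Hypothetical, simplified stand-in for struct intel_ctx_submit_request. */
struct demo_request {
	int ctx_id;		/* stands in for the context pointer */
	unsigned int tail;	/* how far in the ringbuffer the request goes */
	int elsp_submitted;	/* times this request was sent to the ELSP */
};

/* Array-backed FIFO standing in for ring->execlist_queue. */
struct demo_queue {
	struct demo_request reqs[DEMO_QUEUE_MAX];
	size_t count;
};

static bool demo_queue_add(struct demo_queue *q, int ctx_id, unsigned int tail)
{
	if (q->count == DEMO_QUEUE_MAX)
		return false;
	q->reqs[q->count++] = (struct demo_request){ ctx_id, tail, 0 };
	return true;
}

/* "Write" up to the first two queued requests to the ELSP. */
static size_t demo_submit(struct demo_queue *q)
{
	size_t n = q->count < 2 ? q->count : 2;

	for (size_t i = 0; i < n; i++)
		q->reqs[i].elsp_submitted++;
	return n;
}

/*
 * Completion event for ctx_id. Returns true only when the head request was
 * removed, which is the point where new contexts may be submitted.
 */
static bool demo_complete(struct demo_queue *q, int ctx_id)
{
	if (q->count == 0 || q->reqs[0].ctx_id != ctx_id)
		return false;
	if (--q->reqs[0].elsp_submitted > 0)
		return false;	/* a second completion is still expected */
	for (size_t i = 1; i < q->count; i++)
		q->reqs[i - 1] = q->reqs[i];
	q->count--;
	return true;
}
```

Replaying the A, B, C scenario from the commit message against this model: after A completes and B is resubmitted together with C, the first completion of B returns false and leaves the queue untouched, and only the second completion removes B and allows the next submission.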
drm/i915/bdw: Handle context switch events
Handle all context status events in the context status buffer on every
context switch interrupt. We only remove work from the execlist queue
after a context status buffer reports that it has completed and we only
attempt to schedule new contexts on interrupt when a previously submitted
context completes (unless no contexts are queued, which means the GPU is
free).
We cannot call intel_runtime_pm_get() in an interrupt (or with a spinlock
grabbed, FWIW), because it might sleep, which is not a nice thing to do.
Instead, do the runtime_pm get/put together with the create/destroy request,
and handle the forcewake get/put directly.
Signed-off-by: Thomas Daniel <thomas.daniel@intel.com>
v2: Unreferencing the context when we are freeing the request might free
the backing bo, which requires the struct_mutex to be grabbed, so defer
unreferencing and freeing to a bottom half.
v3:
- Ack the interrupt immediately, before trying to handle it (fix for
missing interrupts by Bob Beckett <robert.beckett@intel.com>).
- Update the Context Status Buffer Read Pointer, just in case (spotted
by Damien Lespiau).
v4: New namespace and multiple rebase changes.
v5: Squash with "drm/i915/bdw: Do not call intel_runtime_pm_get() in an
interrupt", as suggested by Daniel.
Signed-off-by: Oscar Mateo <oscar.mateo@intel.com>
Reviewed-by: Damien Lespiau <damien.lespiau@intel.com>
[danvet: Checkpatch ...]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2014-07-24 20:04:39 +04:00
void intel_execlists_handle_ctx_events(struct intel_engine_cs *ring);
2014-11-13 13:27:05 +03:00
void intel_execlists_retire_requests(struct intel_engine_cs *ring);
2014-07-24 20:04:39 +04:00
2014-07-24 20:04:10 +04:00
#endif /* _INTEL_LRC_H_ */