IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Define IS_GEN() similarly to our IS_GEN_RANGE(). but use gen instead of
gen_mask to do the comparison. Now callers can pass then gen as a parameter,
so we don't require one macro for each gen.
The following spatch was used to convert the users of these macros:
@@
expression e;
@@
(
- IS_GEN2(e)
+ IS_GEN(e, 2)
|
- IS_GEN3(e)
+ IS_GEN(e, 3)
|
- IS_GEN4(e)
+ IS_GEN(e, 4)
|
- IS_GEN5(e)
+ IS_GEN(e, 5)
|
- IS_GEN6(e)
+ IS_GEN(e, 6)
|
- IS_GEN7(e)
+ IS_GEN(e, 7)
|
- IS_GEN8(e)
+ IS_GEN(e, 8)
|
- IS_GEN9(e)
+ IS_GEN(e, 9)
|
- IS_GEN10(e)
+ IS_GEN(e, 10)
|
- IS_GEN11(e)
+ IS_GEN(e, 11)
)
v2: use IS_GEN rather than GT_GEN and compare to info.gen rather than
using the bitmask
Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181212181044.15886-2-lucas.demarchi@intel.com
The DDB allocation algorithm currently used by the driver grants each
plane a very small minimum allocation of DDB blocks and then divies up
all of the remaining blocks based on the percentage of the total data
rate that the plane makes up. It turns out that this proportional
allocation approach is overly-generous with the larger planes and can
leave very small planes wthout a big enough allocation to even hit their
level 0 watermark requirements (especially on APL, which has a smaller
DDB in general than other gen9 platforms). Or there can be situations
where the smallest planes hit a lower watermark level than they should
have been able to hit with a more equitable division of DDB blocks, thus
limiting the overall system sleep state that can be achieved.
The bspec now describes an alternate algorithm that can be used to
overcome these types of issues. With the new algorithm, we calculate
all plane watermark values for all wm levels first, then go back and
partition a pipe's DDB space second. The DDB allocation will calculate
what the highest watermark level that can be achieved on *all* active
planes, and then grant the blocks necessary to hit that level to each
plane. Any remaining blocks are then divided up proportionally
according to data rate, similar to the old algorithm.
There was a previous attempt to implement this algorithm a couple years
ago in bb9d85f6e9 ("drm/i915/skl: New ddb allocation algorithm"), but
some regressions were reported, the patch was reverted, and nobody
ever got around to figuring out exactly where the bug was in that
version. Our watermark code has evolved significantly in the meantime,
but we're still getting bug reports caused by the unfair proportional
algorithm, so let's give this another shot.
v2:
- Make sure cursor allocation stays constant and fixed at the end of
the pipe allocation.
- Fix some watermark level iterators that weren't handling the max
level.
v3:
- Ensure we don't leave any DDB blocks unused by using DIV_ROUND_UP+min
to calculate the extra blocks for each plane. (Ville)
- Replace a while() loop with a for() loop to be more consistent with
surrounding code. (Ville)
- Clean unattainable watermark levels with memset rather than directly
clearing the member fields. Also do the same for the transition
watermark values if they can't be achieved. (Ville)
- Drop min_disp_buf_needed calculations in skl_compute_plane_wm() since
the results are no longer needed or used. (Ville)
- Drop skl_latency[0] != 0 sanity check; both watermark methods already
account for an invalid 0 latency by returning FP_16_16_MAX. (Ville)
v4:
- Break DDB allocation loop when total_data_rate=0 rather than
alloc_size=0. If total_data_rate has dropped to 0, all remaining
planes are disabled, which isn't true for alloc_size (we might just
have not had any remaining blocks to hand out). Plus
total_data_rate=0 is the case we need to avoid to a prevent a
div-by-0. (Ville)
- s/DIV_ROUND_UP/DIV64_U64_ROUND_UP/ to prevent 32-bit breakage (Ville)
v5:
- Don't forget to move 'start' pointer forward for UV surface when
setting plane DDB boundaries. (Ville)
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105458
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-2-matthew.d.roper@intel.com
The bspec gives an if/else chain for choosing whether to use "method 1"
or "method 2" for calculating the watermark "Selected Result Blocks"
value for a plane. One of the branches of the if chain is:
"Else If ('plane buffer allocation' is known and (plane buffer
allocation / plane blocks per line) >=1)"
Since our driver currently calculates DDB allocations first and the
actual watermark values second, the plane buffer allocation is known at
this point in our code and we include this test in our driver's logic.
However we plan to soon move to a "watermarks first, ddb allocation
second" sequence where we won't know the DDB allocation at this point.
Let's drop this arm of the if/else statement (effectively considering
the DDB allocation unknown) as an independent patch so that any
regressions can be more accurately bisected to either the different
watermark value (in this patch) or the new DDB allocation (in the next
patch).
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181211173107.11068-1-matthew.d.roper@intel.com
On SKL+ the plane WM/BUF_CFG registers are a proper part of each
plane's register set. That means accessing them will cancel any
pending plane update, and we would need a PLANE_SURF register write
to arm the wm/ddb change as well.
To avoid all the problems with that let's just move the wm/ddb
programming into the plane update/disable hooks. Now all plane
registers get written in one (hopefully atomic) operation.
To make that feasible we'll move the plane ddb tracking into
the crtc state. Watermarks were already tracked there.
v2: Rebase due to input CSC
v3: Split out a bunch of junk (Matt)
v4: Add skl_wm_add_affected_planes() to deal with
cursor special case and non-zero wm register reset value
v5: Drop the unrelated for_each_intel_plane_mask() fix (Matt)
Remove the redundant ddb memset() (Matt)
Cc: Matt Roper <matthew.d.roper@intel.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> #v3
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181127165900.31298-1-ville.syrjala@linux.intel.com
I have a Thinkpad X220 Tablet in my hands that is losing vblank
interrupts whenever LP3 watermarks are used.
If I nudge the latency value written to the WM3 register just
by one in either direction the problem disappears. That to me
suggests that the punit will not enter the corrsponding
powersave mode (MPLL shutdown IIRC) unless the latency value
in the register matches exactly what we read from SSKPD. Ie.
it's not really a latency value but rather just a cookie
by which the punit can identify the desired power saving state.
On HSW/BDW this was changed such that we actually just write
the WM level number into those bits, which makes much more
sense given the observed behaviour.
We could try to handle this by disallowing LP3 watermarks
only when vblank interrupts are enabled but we'd first have
to prove that only vblank interrupts are affected, which
seems unlikely. Also we can't grab the wm mutex from the
vblank enable/disable hooks because those are called with
various spinlocks held. Thus we'd have to redesigne the
watermark locking. So to play it safe and keep the code
simple we simply disable LP3 watermarks on all SNB machines.
To do that we simply zero out the latency values for
watermark level 3, and we adjust the watermark computation
to check for that. The behaviour now matches that of the
g4x/vlv/skl wm code in the presence of a zeroed latency
value.
v2: s/USHRT_MAX/U32_MAX/ for consistency with the types (Chris)
Cc: stable@vger.kernel.org
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Acked-by: Chris Wilson <chris@chris-wilson.co.uk>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101269
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103713
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181114173440.6730-1-ville.syrjala@linux.intel.com
Skylake style watermarks program the UV parameters into wm->uv_wm,
and have a separate DDB allocation for UV blocks into the same plane.
Gen11 watermarks have a separate plane for Y and UV, with separate
mechanisms. The simplest way to make it work is to keep the current
way of programming watermarks and calculate the Y and UV plane
watermarks from the master plane.
Changes since v1:
- Constify crtc_state where possible.
- Make separate paths for planar formats in skl_build_pipe_wm() (Matt)
- Make separate paths for calculating total data rate. (Matt)
- Make sure UV watermarks are unused on gen11+ by adding a WARN. (Matt)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181018115134.9061-5-maarten.lankhorst@linux.intel.com
To make NV12 working on icl, we need to update 2 planes simultaneously.
I've chosen to do this in the CRTC step after plane validation is done,
so we know what planes are (in)visible. The linked Y plane will get
updated in intel_plane_update_planes_on_crtc(), by the call to
update_slave, which gets the master's plane_state as argument.
The link requires both planes for atomic_update to work,
so make sure skl_ddb_add_affected_planes() adds both states.
Changes since v1:
- Introduce icl_is_nv12_y_plane(), instead of hardcoding sprite numbers.
- Put all the state updating login in intel_plane_atomic_check_with_state().
- Clean up changes in intel_plane_atomic_check().
Changes since v2:
- Fix intel_atomic_get_old_plane_state() to actually return old state.
- Move visibility changes to preparation patch.
- Only try to find a Y plane on gen11, earlier platforms only require
a single plane.
Changes since v3:
- Fix checkpatch warning about to_intel_crtc() usage.
- Add affected planes from icl_add_linked_planes() before check_planes(),
it's a cleaner way to do this. (Ville)
Changes since v4:
- Clear plane links in icl_check_nv12_planes() for clarity.
- Only pass crtc_state to icl_check_nv12_planes().
- Use for_each_new_intel_plane_in_state() in icl_check_nv12_planes.
- Rename aux to linked. (Ville)
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20181022135152.15324-1-maarten.lankhorst@linux.intel.com
[mlankhorst: Change bool slave to u32, to satisfy checkpatch]
[mlankhorst: Add WARN_ON's based on Ville's suggestion]
We don't have proper watermark NV12 support on ICL due to differences
in how it should be implemented. In commit 234059da0f
("drm/i915/icl: NV12 y-plane ddb is not in same plane") we avoided
writing the non-existent PLANE_NV12_BUF_CFG registers but we forgot to
also avoid them on the hardware state readout. While the code is still
not correct, at least now we can avoid unclaimed register error
messages when dealing with RGB formats, which makes CI happier.
Also add some FIXME comments in order to make it even more clear that
there's still work to do.
References: commit 234059da0f ("drm/i915/icl: NV12 y-plane ddb is
not in same plane")
Cc: Mahesh Kumar <mahesh1.kumar@intel.com>
Reviewed-by: Mahesh Kumar <mahesh1.kumar@intel.com>
Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180801004614.22149-1-paulo.r.zanoni@intel.com
RPS provides a feedback loop where we use the load during the previous
evaluation interval to decide whether to up or down clock the GPU
frequency. Our responsiveness is split into 3 regimes, a high and low
plateau with the intent to keep the gpu clocked high to cover occasional
stalls under high load, and low despite occasional glitches under steady
low load, and inbetween. However, we run into situations like kodi where
we want to stay at low power (video decoding is done efficiently
inside the fixed function HW and doesn't need high clocks even for high
bitrate streams), but just occasionally the pipeline is more complex
than a video decode and we need a smidgen of extra GPU power to present
on time. In the high power regime, we sample at sub frame intervals with
a bias to upclocking, and conversely at low power we sample over a few
frames worth to provide what we consider to be the right levels of
responsiveness respectively. At low power, we more or less expect to be
kicked out to high power at the start of a busy sequence by waitboosting.
Prior to commit e9af4ea2b9 ("drm/i915: Avoid waitboosting on the active
request") whenever we missed the frame or stalled, we would immediate go
full throttle and upclock the GPU to max. But in commit e9af4ea2b9, we
relaxed the waitboosting to only apply if the pipeline was deep to avoid
over-committing resources for a near miss. Sadly though, a near miss is
still a miss, and perceptible as jitter in the frame delivery.
To try and prevent the near miss before having to resort to boosting
after the fact, we use the pageflip queue as an indication that we are
in an "interactive" regime and so should sample the load more frequently
to provide power before the frame misses it vblank. This will make us
more favorable to providing a small power increase (one or two bins) as
required rather than going all the way to maximum and then having to
work back down again. (We still keep the waitboosting mechanism around
just in case a dramatic change in system load requires urgent uplocking,
faster than we can provide in a few evaluation intervals.)
v2: Reduce rps_set_interactive to a boolean parameter to avoid the
confusion of what if they wanted a new power mode after pinning to a
different mode (which to choose?)
v3: Only reprogram RPS while the GT is awake, it will be set when we
wake the GT, and while off warns about being used outside of rpm.
v4: Fix deferred application of interactive mode
v5: s/state/interactive/
v6: Group the mutex with its principle in a substruct
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107111
Fixes: e9af4ea2b9 ("drm/i915: Avoid waitboosting on the active request")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Radoslaw Szwichtenberg <radoslaw.szwichtenberg@intel.com>
Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180731132629.3381-1-chris@chris-wilson.co.uk