IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
CHIP set bus_width according to the HW configuration, and CORE will use
it as buffer alignment.
v2: Rebase
Signed-off-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Add two sysfs node: core_id, config_id, user can read them to fetch the
HW product information.
Also, use memset to initialize config_id, rather than quirky C syntax.
Courtesy of Nathan Chancellor <natechancellor@gmail.com>.
Signed-off-by: James Qian Wang (Arm Technology China) <james.qian.wang@arm.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
[Merged Nathan's patch that uses memset to initialize config_id into
original patch as the fixes tag changed due to rebase, reworded the
commit to reference the merged patch]
Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>
Asking the GPU to busywait on a memory address, perhaps not unexpectedly
in hindsight for a shared system, leads to bus contention that affects
CPU programs trying to concurrently access memory. This can manifest as
a drop in transcode throughput on highly over-saturated workloads.
The only clue offered by perf, is that the bus-cycles (perf stat -e
bus-cycles) jumped by 50% when enabling semaphores. This corresponds
with extra CPU active cycles being attributed to intel_idle's mwait.
This patch introduces a heuristic to try and detect when more than one
client is submitting to the GPU pushing it into an oversaturated state.
As we already keep track of when the semaphores are signaled, we can
inspect their state on submitting the busywait batch and if we planned
to use a semaphore but were too late, conclude that the GPU is
overloaded and not try to use semaphores in future requests. In
practice, this means we optimistically try to use semaphores for the
first frame of a transcode job split over multiple engines, and fail if
there are multiple clients active and continue not to use semaphores for
the subsequent frames in the sequence. Periodically, we try to
optimistically switch semaphores back on whenever the client waits to
catch up with the transcode results.
With 1 client, on Broxton J3455, with the relative fps normalized by %cpu:
x no semaphores
+ drm-tip
* patched
+------------------------------------------------------------------------+
| * |
| *+ |
| **+ |
| **+ x |
| x * +**+ x |
| x x * * +***x xx |
| x x * * *+***x *x |
| x x* + * * *****x *x x |
| + x xx+x* + *** * ********* x * |
| + x xx+x* * *** +** ********* xx * |
| * + ++++* + x*x****+*+* ***+*************+x* * |
|*+ +** *+ + +* + *++****** *xxx**********x***+*****************+*++ *|
| |__________A_____M_____| |
| |_______________A____M_________| |
| |____________A___M________| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 2.60475 3.50941 3.31123 3.2143953 0.21117399
+ 120 2.3826 3.57077 3.25101 3.1414161 0.28146407
Difference at 95.0% confidence
-0.0729792 +/- 0.0629585
-2.27039% +/- 1.95864%
(Student's t, pooled s = 0.248814)
* 120 2.35536 3.66713 3.2849 3.2059917 0.24618565
No difference proven at 95.0% confidence
With 10 clients over-saturating the pipeline:
x no semaphores
+ drm-tip
* patched
+------------------------------------------------------------------------+
| ++ ** |
| ++ ** |
| ++ ** |
| ++ ** |
| ++ xx *** |
| ++ xx *** |
| ++ xxx*** |
| ++ xxx*** |
| +++ xxx*** |
| +++ xx**** |
| +++ xx**** |
| +++ xx**** |
| +++ xx**** |
| ++++ xx**** |
| +++++ xx**** |
| +++++ x x****** |
| ++++++ xxx******* |
| ++++++ xxx******* |
| ++++++ xxx******* |
| ++++++ xx******** |
| ++++++ xxxx******** |
| ++++++ xxxx******** |
| ++++++++ xxxxx********* |
|+ + + + ++++++++ xxx*xx**********x* *|
| |__A__| |
| |__AM__| |
| |__A_| |
+------------------------------------------------------------------------+
N Min Max Median Avg Stddev
x 120 2.47855 2.8972 2.72376 2.7193402 0.074604933
+ 120 1.17367 1.77459 1.71977 1.6966782 0.085850697
Difference at 95.0% confidence
-1.02266 +/- 0.0203502
-37.607% +/- 0.748352%
(Student's t, pooled s = 0.0804246)
* 120 2.57868 3.00821 2.80142 2.7923878 0.058646477
Difference at 95.0% confidence
0.0730476 +/- 0.0169791
2.68622% +/- 0.624383%
(Student's t, pooled s = 0.0671018)
Indicating that we've recovered the regression from enabling semaphores
on this saturated setup, with a hint towards an overall improvement.
Very similar, but of smaller magnitude, results are observed on both
Skylake(gt2) and Kabylake(gt4). This may be due to the reduced impact of
bus-cycles, where we see a 50% hit on Broxton, it is only 10% on the big
core, in this particular test.
One observation to make here is that for a greedy client trying to
maximise its own throughput, using semaphores is the right choice. It is
only the holistic system-wide view that semaphores of one client
impacts another and reduces the overall throughput where we would choose
to disable semaphores.
The most noticeable negactive impact this has is on the no-op
microbenchmarks, which are also very notable for having no cpu bus load.
In particular, this increases the runtime and energy consumption of
gem_exec_whisper.
Fixes: e88619646971 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Cc: Dmitry Ermilov <dmitry.ermilov@intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190504070707.30902-1-chris@chris-wilson.co.uk
(cherry picked from commit ca6e56f654e7b241256ffba78cd2abb22aa3bc97)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Currently we submit the semaphore busywait as soon as the signaler is
submitted to HW. However, we may submit the signaler as the tail of a
batch of requests, and even not as the first context in the HW list,
i.e. the busywait may start spinning far in advance of the signaler even
starting.
If we wait until the request before the signaler is completed before
submitting the busywait, we prevent the busywait from starting too
early, if the signaler is not first in submission port.
To handle the case where the signaler is at the start of the second (or
later) submission port, we will need to delay the execution callback
until we know the context is promoted to port0. A challenge for later.
Fixes: e88619646971 ("drm/i915: Use HW semaphores for inter-engine synchronisation on gen8+")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501114541.10077-9-chris@chris-wilson.co.uk
(cherry picked from commit 0d90ccb70211cbf55140e91bd39db684aa4c16e9)
[Joonas: edited Fixes: tag into single line.]
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Remove mmiowb() from the kernel memory barrier API and instead, for
architectures that need it, hide the barrier inside spin_unlock() when
MMIO has been performed inside the critical section.
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAlzMFaUACgkQt6xw3ITB
YzRICQgAiv7wF/yIbBhDOmCNCAKDO59chvFQWxXWdGk/aAB56kwKAMXJgLOvlMG/
VRuuLyParTFQETC3jaxKgnO/1hb+PZLDt2Q2KqixtjIzBypKUPWvK2sf6THhSRF1
GK0DBVUd1rCrWrR815+SPb8el4xXtdBzvAVB+Fx35PXVNpdRdqCkK+EQ6UnXGokm
rXXHbnfsnquBDtmb4CR4r2beH+aNElXbdt0Kj8VcE5J7f7jTdW3z6Q9WFRvdKmK7
yrsxXXB2w/EsWXOwFp0SLTV5+fgeGgTvv8uLjDw+SG6t0E0PebxjNAflT7dPrbYL
WecjKC9WqBxrGY+4ew6YJP70ijLBCw==
=aC8m
-----END PGP SIGNATURE-----
Merge tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull mmiowb removal from Will Deacon:
"Remove Mysterious Macro Intended to Obscure Weird Behaviours (mmiowb())
Remove mmiowb() from the kernel memory barrier API and instead, for
architectures that need it, hide the barrier inside spin_unlock() when
MMIO has been performed inside the critical section.
The only relatively recent changes have been addressing review
comments on the documentation, which is in a much better shape thanks
to the efforts of Ben and Ingo.
I was initially planning to split this into two pull requests so that
you could run the coccinelle script yourself, however it's been plain
sailing in linux-next so I've just included the whole lot here to keep
things simple"
* tag 'arm64-mmiowb' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (23 commits)
docs/memory-barriers.txt: Update I/O section to be clearer about CPU vs thread
docs/memory-barriers.txt: Fix style, spacing and grammar in I/O section
arch: Remove dummy mmiowb() definitions from arch code
net/ethernet/silan/sc92031: Remove stale comment about mmiowb()
i40iw: Redefine i40iw_mmiowb() to do nothing
scsi/qla1280: Remove stale comment about mmiowb()
drivers: Remove explicit invocations of mmiowb()
drivers: Remove useless trailing comments from mmiowb() invocations
Documentation: Kill all references to mmiowb()
riscv/mmiowb: Hook up mmwiob() implementation to asm-generic code
powerpc/mmiowb: Hook up mmwiob() implementation to asm-generic code
ia64/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
mips/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
sh/mmiowb: Add unconditional mmiowb() to arch_spin_unlock()
m68k/io: Remove useless definition of mmiowb()
nds32/io: Remove useless definition of mmiowb()
x86/io: Remove useless definition of mmiowb()
arm64/io: Remove useless definition of mmiowb()
ARM/io: Remove useless definition of mmiowb()
mmiowb: Hook up mmiowb helpers to spinlocks and generic I/O accessors
...
Pull stack trace updates from Ingo Molnar:
"So Thomas looked at the stacktrace code recently and noticed a few
weirdnesses, and we all know how such stories of crummy kernel code
meeting German engineering perfection end: a 45-patch series to clean
it all up! :-)
Here's the changes in Thomas's words:
'Struct stack_trace is a sinkhole for input and output parameters
which is largely pointless for most usage sites. In fact if embedded
into other data structures it creates indirections and extra storage
overhead for no benefit.
Looking at all usage sites makes it clear that they just require an
interface which is based on a storage array. That array is either on
stack, global or embedded into some other data structure.
Some of the stack depot usage sites are outright wrong, but
fortunately the wrongness just causes more stack being used for
nothing and does not have functional impact.
Another oddity is the inconsistent termination of the stack trace
with ULONG_MAX. It's pointless as the number of entries is what
determines the length of the stored trace. In fact quite some call
sites remove the ULONG_MAX marker afterwards with or without nasty
comments about it. Not all architectures do that and those which do,
do it inconsistenly either conditional on nr_entries == 0 or
unconditionally.
The following series cleans that up by:
1) Removing the ULONG_MAX termination in the architecture code
2) Removing the ULONG_MAX fixups at the call sites
3) Providing plain storage array based interfaces for stacktrace
and stackdepot.
4) Cleaning up the mess at the callsites including some related
cleanups.
5) Removing the struct stack_trace based interfaces
This is not changing the struct stack_trace interfaces at the
architecture level, but it removes the exposure to the generic
code'"
* 'core-stacktrace-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (45 commits)
x86/stacktrace: Use common infrastructure
stacktrace: Provide common infrastructure
lib/stackdepot: Remove obsolete functions
stacktrace: Remove obsolete functions
livepatch: Simplify stack trace retrieval
tracing: Remove the last struct stack_trace usage
tracing: Simplify stack trace retrieval
tracing: Make ftrace_trace_userstack() static and conditional
tracing: Use percpu stack trace buffer more intelligently
tracing: Simplify stacktrace retrieval in histograms
lockdep: Simplify stack trace handling
lockdep: Remove save argument from check_prev_add()
lockdep: Remove unused trace argument from print_circular_bug()
drm: Simplify stacktrace handling
dm persistent data: Simplify stack trace handling
dm bufio: Simplify stack trace retrieval
btrfs: ref-verify: Simplify stack trace retrieval
dma/debug: Simplify stracktrace retrieval
fault-inject: Simplify stacktrace retrieval
mm/page_owner: Simplify stack trace handling
...
Pull objtool updates from Ingo Molnar:
"This is a series from Peter Zijlstra that adds x86 build-time uaccess
validation of SMAP to objtool, which will detect and warn about the
following uaccess API usage bugs and weirdnesses:
- call to %s() with UACCESS enabled
- return with UACCESS enabled
- return with UACCESS disabled from a UACCESS-safe function
- recursive UACCESS enable
- redundant UACCESS disable
- UACCESS-safe disables UACCESS
As it turns out not leaking uaccess permissions outside the intended
uaccess functionality is hard when the interfaces are complex and when
such bugs are mostly dormant.
As a bonus we now also check the DF flag. We had at least one
high-profile bug in that area in the early days of Linux, and the
checking is fairly simple. The checks performed and warnings emitted
are:
- call to %s() with DF set
- return with DF set
- return with modified stack frame
- recursive STD
- redundant CLD
It's all x86-only for now, but later on this can also be used for PAN
on ARM and objtool is fairly cross-platform in principle.
While all warnings emitted by this new checking facility that got
reported to us were fixed, there might be GCC version dependent
warnings that were not reported yet - which we'll address, should they
trigger.
The warnings are non-fatal build warnings"
* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
mm/uaccess: Use 'unsigned long' to placate UBSAN warnings on older GCC versions
x86/uaccess: Dont leak the AC flag into __put_user() argument evaluation
sched/x86_64: Don't save flags on context switch
objtool: Add Direction Flag validation
objtool: Add UACCESS validation
objtool: Fix sibling call detection
objtool: Rewrite alt->skip_orig
objtool: Add --backtrace support
objtool: Rewrite add_ignores()
objtool: Handle function aliases
objtool: Set insn->func for alternatives
x86/uaccess, kcov: Disable stack protector
x86/uaccess, ftrace: Fix ftrace_likely_update() vs. SMAP
x86/uaccess, ubsan: Fix UBSAN vs. SMAP
x86/uaccess, kasan: Fix KASAN vs SMAP
x86/smap: Ditch __stringify()
x86/uaccess: Introduce user_access_{save,restore}()
x86/uaccess, signal: Fix AC=1 bloat
x86/uaccess: Always inline user_access_begin()
x86/uaccess, xen: Suppress SMAP warnings
...
[Why]
The type of 'r' is uint32_t and the return codes for both:
- reservation_object_wait_timeout_rcu
- amdgpu_bo_reserve
...are signed. While it works for the latter since the check is
done on != 0 it doesn't work for the former since we check <= 0.
[How]
Make 'r' a long in commit planes so we're not doing any unsigned/signed
conversion here in the first place.
v2: use long instead of int (Christian)
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SR-IOV host side will send IDH_QUERY_ALIVE to guest VM to check
if this guest VM is still alive (not destroyed). The only thing
guest KMD need to do is to send ACK back to host.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
amdgpu_vm_make_compute is used to turn a GFX VM into a compute VM,
the prerequisite is this VM is clean. Let's check if some page tables
are already filled , while not check if some mapping is already made.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In Multi-VFs stress test, sometimes we see IRQ lost when running
benchmark, just rearm it.
Signed-off-by: Trigger Huang <Trigger.Huang@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
In amdgpu_atif_handler, when hotplug event received, remove
ATPX_DGPU_REQ_POWER_FOR_DISPLAYS check. This bit's check will cause missing
system resume.
Signed-off-by: Aaron Liu <aaron.liu@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Cc: stable@vger.kernel.org
Depends on GEN family and I915_PARAM_HAS_CONTEXT_ISOLATION, Mesa driver
will decide whether constant buffer 0 address is relative or absolute,
and load GPU initial state by lri to context mmio INSTPM (GEN8)
or 0x20D8 (>=GEN9).
Mesa Commit fa8a764b62
("i965: Use absolute addressing for constant buffer 0 on Kernel 4.16+.")
INSTPM is already added to gen8_engine_mmio_list, but 0x20D8 is missed
in gen9_engine_mmio_list. From GVT point of view, different guest could
have different context so should switch those mmio accordingly.
v2: Update fixes commit ID.
Fixes: 178657139307 ("drm/i915/gvt: vGPU context switch")
Reviewed-by: Zhenyu Wang <zhenyuw@linux.intel.com>
Signed-off-by: Colin Xu <colin.xu@intel.com>
Signed-off-by: Zhenyu Wang <zhenyuw@linux.intel.com>
(cherry picked from commit 1e8b15a1988ed3c7429402017d589422628cdf47)
The DMA masks need to be set correctly before any DMA API activity kicks
off, and the current point in panfrost_probe() is way too late in that
regard. since panfrost_mmu_init() has already set up a live address
space and DMA-mapped MMU pagetables. We can't set masks until we've
queried the appropriate value from MMU_FEATURES, but as soon as
reasonably possible after that should suffice.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Link: https://patchwork.freedesktop.org/patch/msgid/64361b929a5c61d2ab9580262ecb3d369164cfcb.1556195258.git.robin.murphy@arm.com
-misc-next-fixes.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlzMmCIACgkQlvcN/ahK
BwqHEwf+LNISngz8pjHWljiaOQGfMHhE2LHR55C4ncF8bFUD82k7zVwxEm0Ls7b9
xnmoGGTnG46viUoZDcZsWcy/BkEYbEQ7aTsaexDJ1w9SeMDh0OKrlPlsSJjKXDGL
mUM8zVaRnKuFCwTmh3O0GPLV2iNyTwjPiIXBwcif2/ZLWpCrzwZqv4bgiiSo+eOD
+12qL9h+MdlTWXvLCroMe44o2QNnKukkkGUAaT/oQloJPhe8hzEU9WloY+5VgGaH
IZ0mPcDz5QjU6OflmHrH6XSpcp5UdRsgks7ws9+zdACiPeS/XNgpnqPSIi/EXGtK
vgIkSe0SAFH87iRihjN+5bHEhK9fug==
=c8P8
-----END PGP SIGNATURE-----
Merge panfrost-fixes into drm-misc-next-fixes
Merging some panfrost fixes as well as one rockchip fix that _just_
missed feature freeze.
Signed-off-by: Sean Paul <seanpaul@chromium.org>
If there is a match in the HW DB, the function is left early, before
inititalizing the idle mask. Fix this by doing the init earlier, as
only old GPUs, not present in the HW DB need a different idle mask.
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Philipp Zabel <p.zabel@pengutronix.de>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRcEzekXsqa64kGDp7j7w1vZxhRxQUCXMrhnAAKCRDj7w1vZxhR
xZAEAQCAosKQv+TcDiykmghLMlhz/pj+T4I1GAnQnDOUO7/GWAEAuVtky5yMlJhP
g6lpz+Xlc/8lQxIRxtST0QnoQOS5kQg=
=6Pe3
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-fixes-2019-05-02' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
- One revert for QXL for a DRI3 breakage
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Maxime Ripard <maxime.ripard@bootlin.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190502122529.hguztj3kncaixe3d@flea
Unfortunately userspace users of this API cannot be publicly disclosed
yet.
This commit effectively disables timeline syncobj ioctls for all
drivers. Each driver wishing to support this feature will need to
expose DRIVER_SYNCOBJ_TIMELINE.
v2: Add uAPI capability check (Christian)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Christian König <christian.koenig@amd.com> (v1)
Cc: Dave Airlie <airlied@redhat.com>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416125750.31370-1-lionel.g.landwerlin@intel.com
We've been somewhat inconsistent when adding the new ioctl and
returned ENODEV instead of EOPNOTSUPPORTED upon failing the syncobj
capibility.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: ea569910cbab98 ("drm/syncobj: add transition iotcls between binary and timeline v2")
Fixes: 01d6c357837918 ("drm/syncobj: add support for timeline point wait v8")
Cc: Dave Airlie <airlied@redhat.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Chunming Zhou <david1.zhou@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com> for the series.
Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190416123048.2913-1-lionel.g.landwerlin@intel.com
sphinx: squash warning (Sean)
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCgAdFiEEfxcpfMSgdnQMs+QqlvcN/ahKBwoFAlzJ7o0ACgkQlvcN/ahK
Bwq3Dgf/SOIjcR6Kxz6f6M2qg0PfsVpHdl7P8mtaofg6DoLsmdUeYP+D6h+eINMj
rNev6QjH7NXTS9Z17Z6hTVMUkYC9pi6LlkXx/4BYF6lPnt7Jy3h4qkBch8PshtBJ
ijmrv08EEKMrEdJKKpLQ0XsBiq5AzEN3kX+Lxlic29glvmpwpkSyQomGPjDUcmBd
/nVZRc9fdHS2qU7sl0nROqm4u5tBSGyV+uDp9Jpjobr45ptJd9l23jY2/JFn3cCk
pyGGbHE67y67iqofvwZWCx4IW0zCoAHmo4TQiXGdDsYV9Be8vLajK1ui63aw1ZD4
5s/N82FgIC9cy6inGsNG1SKDMJamcQ==
=2x7W
-----END PGP SIGNATURE-----
Merge tag 'drm-misc-next-fixes-2019-05-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next
core: restore drm mmap_range size back to 1TB (Philip)
sphinx: squash warning (Sean)
Cc: Philip Yang <Philip.Yang@amd.com>
Cc: Sean Paul <seanpaul@chromium.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20190501190921.GA120430@art_vandelay
On a failed resume we may experience unrecoverable errors. Plumb the error code
through to actually let the driver fail. On a reverse-prime setup this helps the
drm subsystem to at least recover the integrated gpu.
This can especially happen with secboot timing out, leaving the hardware in a
non-functioning state.
Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
There is a spelling mistake in a nvkm_debug message. Fix it.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Reviewed-by: Mukesh Ojha <mojha@codeaurora.org>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
For a while, we've had the problem of i2c bus access not grabbing
a runtime PM ref when it's being used in userspace by i2c-dev, resulting
in nouveau spamming the kernel log with errors if anything attempts to
access the i2c bus while the GPU is in runtime suspend. An example:
[ 130.078386] nouveau 0000:01:00.0: i2c: aux 000d: begin idle timeout ffffffff
Since the GPU is in runtime suspend, the MMIO region that the i2c bus is
on isn't accessible. On x86, the standard behavior for accessing an
unavailable MMIO region is to just return ~0.
Except, that turned out to be a lie. While computers with a clean
concious will return ~0 in this scenario, some machines will actually
completely hang a CPU on certian bad MMIO accesses. This was witnessed
with someone's Lenovo ThinkPad P50, where sensors-detect attempting to
access the i2c bus while the GPU was suspended would result in a CPU
hang:
CPU: 5 PID: 12438 Comm: sensors-detect Not tainted 5.0.0-0.rc4.git3.1.fc30.x86_64 #1
Hardware name: LENOVO 20EQS64N17/20EQS64N17, BIOS N1EET74W (1.47 ) 11/21/2017
RIP: 0010:ioread32+0x2b/0x30
Code: 81 ff ff ff 03 00 77 20 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3
48 c7 c6 e1 0c 36 96 e8 2d ff ff ff b8 ff ff ff ff c3 8b 07 <c3> 0f 1f
40 00 49 89 f0 48 81 fe ff ff 03 00 76 04 40 88 3e c3 48
RSP: 0018:ffffaac3c5007b48 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff13
RAX: 0000000001111000 RBX: 0000000001111000 RCX: 0000043017a97186
RDX: 0000000000000aaa RSI: 0000000000000005 RDI: ffffaac3c400e4e4
RBP: ffff9e6443902c00 R08: ffffaac3c400e4e4 R09: ffffaac3c5007be7
R10: 0000000000000004 R11: 0000000000000001 R12: ffff9e6445dd0000
R13: 000000000000e4e4 R14: 00000000000003c4 R15: 0000000000000000
FS: 00007f253155a740(0000) GS:ffff9e644f600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00005630d1500358 CR3: 0000000417c44006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
g94_i2c_aux_xfer+0x326/0x850 [nouveau]
nvkm_i2c_aux_i2c_xfer+0x9e/0x140 [nouveau]
__i2c_transfer+0x14b/0x620
i2c_smbus_xfer_emulated+0x159/0x680
? _raw_spin_unlock_irqrestore+0x1/0x60
? rt_mutex_slowlock.constprop.0+0x13d/0x1e0
? __lock_is_held+0x59/0xa0
__i2c_smbus_xfer+0x138/0x5a0
i2c_smbus_xfer+0x4f/0x80
i2cdev_ioctl_smbus+0x162/0x2d0 [i2c_dev]
i2cdev_ioctl+0x1db/0x2c0 [i2c_dev]
do_vfs_ioctl+0x408/0x750
ksys_ioctl+0x5e/0x90
__x64_sys_ioctl+0x16/0x20
do_syscall_64+0x60/0x1e0
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f25317f546b
Code: 0f 1e fa 48 8b 05 1d da 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
f0 ff ff 73 01 c3 48 8b 0d ed d9 0c 00 f7 d8 64 89 01 48
RSP: 002b:00007ffc88caab68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
RAX: ffffffffffffffda RBX: 00005630d0fe7260 RCX: 00007f25317f546b
RDX: 00005630d1598e80 RSI: 0000000000000720 RDI: 0000000000000003
RBP: 00005630d155b968 R08: 0000000000000001 R09: 00005630d15a1da0
R10: 0000000000000070 R11: 0000000000000246 R12: 00005630d1598e80
R13: 00005630d12f3d28 R14: 0000000000000720 R15: 00005630d12f3ce0
watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [sensors-detect:12438]
Yikes! While I wanted to try to make it so that accessing an i2c bus on
nouveau would wake up the GPU as needed, airlied pointed out that pretty
much any usecase for userspace accessing an i2c bus on a GPU (mainly for
the DDC brightness control that some displays have) is going to only be
useful while there's at least one display enabled on the GPU anyway, and
the GPU never sleeps while there's displays running.
Since teaching the i2c bus to wake up the GPU on userspace accesses is a
good deal more difficult than it might seem, mostly due to the fact that
we have to use the i2c bus during runtime resume of the GPU, we instead
opt for the easiest solution: don't let userspace access i2c busses on
the GPU at all while it's in runtime suspend.
Changes since v1:
* Also disable i2c busses that run over DP AUX
Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Commit 3a6536c51d5d ("drm/nouveau: Intercept ACPI_VIDEO_NOTIFY_PROBE")
added a definition of ACPI_VIDEO_NOTIFY_PROBE because <acpi/video.h> didn't
supply one. Later, commit eff4a751cce5 ("ACPI / video: Move
ACPI_VIDEO_NOTIFY_* defines to acpi/video.h") moved ACPI_VIDEO_NOTIFY_PROBE
and other definitions to <acpi/video.h>, so the copy in nouveau_display.c
is now unnecessary.
Remove the unnecessary definition from nouveau_display.c.
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
CC: Hans de Goede <hdegoede@redhat.com>
Acked-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If the BAR initialization failed it may leave the vmm structure in an
unitialized state, leading to a null-pointer-dereference when the vmm is
dereferenced during teardown.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If the BAR is zero size, it indicates it was never successfully mapped.
Ensure that the BAR is valid during initialization before attempting to
use it.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
If the BAR is zero size, it indicates it was never successfully mapped.
Ensure that the BAR is valid during initialization before attempting to
use it.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
Check bar1's new vmm creation return value for errors.
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
This reverts commit f4c34b1e2a37d5676180901fa6ff188bcb6371f8.
Simliar to commit a0cecc23cfcb Revert "drm/virtio: drop prime
import/export callbacks". We have to do the same with qxl,
for the same reasons (it breaks DRI3).
Drop the WARN_ON_ONCE().
Fixes: f4c34b1e2a37d5676 ("drm/qxl: drop prime import/export callbacks")
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Link: http://patchwork.freedesktop.org/patch/msgid/20190426053324.26443-1-kraxel@redhat.com
Acked-by: Daniel Vetter <daniel@ffwll.ch>
WaEnableStateCacheRedirectToCS context workaround configures the L3 cache
to benefit 3d workloads but media has different requirements.
Remove the workaround and whitelist the register to allow any userspace
configure the behaviour to their liking.
v2:
* Remove the workaround apart from adding the whitelist.
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: kevin.ma@intel.com
Cc: xiaogang.li@intel.com
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Anuj Phogat <anuj.phogat@gmail.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20190418100634.984-1-tvrtko.ursulin@linux.intel.com
Fixes: f63c7b4880aa ("drm/i915/icl: WaEnableStateCacheRedirectToCS")
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
[tursulin: Anuj reported no GPU hangs or performance regressions with old
Mesa on patched kernel.]
(cherry picked from commit 0fc2273b9ab7f07cdef448e99525e481535e1ab0)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Use new helper pci_dev_id() to simplify the code.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Christian König <christian.koenig@amd.com>
Power down the engine also along with disabling its DPM
functionality.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
SMU will use this interface to power down the VCE engine.
Signed-off-by: Evan Quan <evan.quan@amd.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Pre-DCE12 needs special treatment for BTR / low framerate
compensation for more stable behaviour:
According to comments in the code and some testing on DCE-8
and DCE-11, DCE-11 and earlier only apply VTOTAL_MIN/MAX
programming with a lag of one frame, so the special BTR hw
programming for intermediate fixed duration frames must be
done inside the current frame at flip submission in atomic
commit tail, ie. one vblank earlier, and the fixed refresh
intermediate frame mode must be also terminated one vblank
earlier on pre-DCE12 display engines.
To achieve proper termination on < DCE-12 shift the point
when the switch-back from fixed vblank duration to variable
vblank duration happens from the start of VBLANK (vblank irq,
as done on DCE-12+) to back-porch or end of VBLANK (handled
by vupdate irq handler). We must leave the switch-back code
inside VBLANK irq for DCE12+, as before.
Doing this, we get much better behaviour of BTR for up-sweeps,
ie. going from short to long frame durations (~high to low fps)
and for constant framerate flips, as tested on DCE-8 and
DCE-11. Behaviour is still not quite as good as on DCN-1
though.
On down-sweeps, going from long to short frame durations
(low fps to high fps) < DCE-12 is a little bit improved,
although by far not as much as for up-sweeps and constant
fps.
v2: Fix some wrong locking, as pointed out by Nicholas.
v3: Simplify if-condition in vupdate-irq - nit by Nicholas.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
The comparison of inserted_frame_duration_in_us against a
duration calculated from max_refresh_in_uhz is both wrong
in its math and not needed, as the min_duration_in_us value
is already cached in in_out_vrr for reuse. No need to
recalculate it wrongly at each invocation.
Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reviewed-by: Nicholas Kazlauskas <Nicholas.Kazlauskas@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
RGB565 support isn't restricted to just the primary plane in DC, so
also expose support for it on overlays.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: David Francis <david.francis@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
[Why]
Originally we did the amdgpu_dm_handle_vrr_transition call before
interrupts were enabled. After the interrupt toggling logic was
moved around for support enabling CRTCs with no primary planes
active this was no longer being called in the case where there
wasn't a modeset.
This fixes failures in igt@kms_vrr@* with error
"Timed out: Waiting for vblank event".
[How]
Shift them back into the loop that always ran before interrupts were
enabled.
Pull out the logic that updated VRR state into the same loop since
there's no reason these need to be split.
In the case where we're going from VRR off, no planes to VRR on, some
active planes we'll still be covered for having the VRR vupdate
handler enabled - vblank will be re-enabled at this point, it will
see that VRR is active and set the vupdate interrupt on there.
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Reviewed-by: David Francis <David.Francis@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Refactor dp vendor parsing int to a new function, and call it before
get_active_converter_info().
Also, add a flag to skip parsing of Display ID 2.0. Some devices fail on
readind DID2, but we shouldn't fail EDID read because of it. Add this
flag to facilitate the logic.
Signed-off-by: John Barberiz <John.Barberiz@amd.com>
Reviewed-by: Aric Cyr <Aric.Cyr@amd.com>
Acked-by: Leo Li <sunpeng.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>