linux

iv/linux

Author	SHA1	Message	Date
Dmitry Osipenko	43240bbd87	gpu: host1x: At first try a non-blocking allocation for the gather copy The blocking gather copy allocation is a major performance downside of the Host1x firewall, it may take hundreds milliseconds which is unacceptable for the real-time graphics operations. Let's try a non-blocking allocation first as a least invasive solution, it makes opentegra (Xorg driver) performance indistinguishable with/without the firewall. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:25:56 +02:00
Mikko Perttunen	8474b02531	gpu: host1x: Refactor channel allocation code This is largely a rewrite of the Host1x channel allocation code, bringing several changes: - The previous code could deadlock due to an interaction between the 'reflock' mutex and CDMA timeout handling. This gets rid of the mutex. - Support for more than 32 channels, required for Tegra186 - General refactoring, including better encapsulation of channel ownership handling into channel.c Signed-off-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Dmitry Osipenko <digetx@gmail.com> Tested-by: Dmitry Osipenko <digetx@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:25:38 +02:00
Dmitry Osipenko	03f0de770e	gpu: host1x: Remove unused host1x_cdma_stop() definition There is no host1x_cdma_stop() in the code, let's remove its definition from the header file. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:25:18 +02:00
Dmitry Osipenko	03ebcaa3de	gpu: host1x: Remove unused 'struct host1x_cmdbuf' The struct host1x_cmdbuf is unused, let's remove it. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:24:59 +02:00
Dmitry Osipenko	a47ac10e6e	gpu: host1x: Check waits in the firewall Check waits in the firewall in a way it is done for relocations. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:24:41 +02:00
Ville Syrjälä	da1d0e2655	drm/i915: Plumb the correct acquire ctx into intel_crtc_disable_noatomic() If intel_crtc_disable_noatomic() were to ever get called during resume we'd end up deadlocking since resume has its own acquire_ctx but intel_crtc_disable_noatomic() still tries to use the mode_config.acquire_ctx. Pass down the correct acquire ctx from the top. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-3-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2017-06-15 15:24:00 +03:00
Dmitry Osipenko	0f563a4bf6	gpu: host1x: Forbid unrelated SETCLASS opcode in the firewall Several channels could be made to write the same unit concurrently via the SETCLASS opcode, trusting userspace is a bad idea. It should be possible to drop the per-client channel reservation and add a per-unit locking by inserting MLOCK's to the command stream to re-allow the SETCLASS opcode, but it will be much more work. Let's forbid the unit-unrelated class changes for now. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:23:50 +02:00
Dmitry Osipenko	ef81624994	gpu: host1x: Forbid RESTART opcode in the firewall The RESTART opcode terminates the gather and restarts the CDMA fetching from a specified word << 2 relative to the CDMA start address. That shouldn't be allowed to be done by userspace. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:23:18 +02:00
Dmitry Osipenko	571cbf70c1	gpu: host1x: Forbid relocation address shifting in the firewall Incorrectly shifted relocation address will cause a lower memory corruption and likely a hang on a write or a read of an arbitrary data in case of IOMMU absence. As of now, there is no known use for the address shifting and adding a proper shifts / sizes validation is a much more work. Let's forbid shifts in the firewall till a proper validation is implemented. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:22:32 +02:00
Dmitry Osipenko	47f89c10dd	gpu: host1x: Do not leak BO's phys address to userspace Perform gathers copying before patching them, so that original gathers are left untouched. That's not as bad as leaking kernel addresses, but still doesn't feel right. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:22:03 +02:00
Dmitry Osipenko	e5855aa3e6	gpu: host1x: Correct host1x_job_pin() error handling In case of relocations / waitchecks patching failure the jobs pins stay referenced till DRM file get closed, wasting memory. Add the missed unpinning. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:21:46 +02:00
Dmitry Osipenko	3833d16f16	gpu: host1x: Initialize firewall class to the job's one The commands stream is prepended by the jobs class on the CDMA submission, so that explicitly setting a module class in the commands stream isn't necessary. The firewall initializes its class to 0 and the command stream that doesn't explicitly specify the class effectively bypasses the firewall. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:21:23 +02:00
Dmitry Osipenko	80d3eef16e	drm/tegra: dc: Disable plane if it is invisible On Tegra20 if plane has width or height equal to 0, it will be infinitely wide or tall. Let's disable the plane if it is invisible on atomic state committing to fix the issue. The Rockchip DRM driver does the same. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:20:30 +02:00
Dmitry Osipenko	7d2058571a	drm/tegra: dc: Apply clipping to the plane On Tegra20 an overlay plane should be clipped, otherwise its output is distorted once plane crosses display boundary. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:20:11 +02:00
Ville Syrjälä	aecd36b8a1	drm/i915: Fix deadlock with the pipe A quirk during resume Pass down the correct acquire context to the pipe A quirk load detect hack during display resume. Avoids deadlocking the entire thing. Cc: stable@vger.kernel.org Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> Fixes: e2c8b8701e2d ("drm/i915: Use atomic helpers for suspend, v2.") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170601143619.27840-2-ville.syrjala@linux.intel.com Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>	2017-06-15 15:19:32 +03:00
Dmitry Osipenko	6ac1571b4c	drm/tegra: dc: Avoid reset asserts on Tegra20 Commit 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") introduced HW reset control. It causes a hang on Tegra20 if both display controllers are utilized (RGB panel and HDMI). The TRM suggests that each display controller has its own reset control, apparently it is not correct. Fixes: 33a8eb8d40ee ("drm/tegra: dc: Implement runtime PM") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:19:23 +02:00
Dmitry Osipenko	e0b2ce0210	drm/tegra: Check syncpoint ID in the 'submit' IOCTL In case of invalid syncpoint ID, the host1x_syncpt_get() returns NULL and none of its users perform a check of the returned pointer later. Let's bail out until it's too late. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:17:21 +02:00
Dmitry Osipenko	d0fbbdff2e	drm/tegra: Correct copying of waitchecks and disable them in the 'submit' IOCTL The waitchecks along with multiple syncpoints per submit are not ready for use yet, let's forbid them for now. Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:16:37 +02:00
Dmitry Osipenko	368f622c0d	drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL If commands buffer claims a number of words that is higher than its BO can fit, a kernel OOPS will be fired on the out-of-bounds BO access. This was triggered by an opentegra Xorg driver that erroneously pushed too many commands to the pushbuf. The CDMA commands buffer address is 4 bytes aligned, so check its alignment. The maximum number of the CDMA gather fetches is 16383, add a check for it. Add a sanity check for the relocations in a same way. [ 46.829393] Unable to handle kernel paging request at virtual address f09b2000 ... [<c04a3ba4>] (host1x_job_pin) from [<c04dfcd0>] (tegra_drm_submit+0x474/0x510) [<c04dfcd0>] (tegra_drm_submit) from [<c04deea0>] (tegra_submit+0x50/0x6c) [<c04deea0>] (tegra_submit) from [<c04c07c0>] (drm_ioctl+0x1e4/0x3ec) [<c04c07c0>] (drm_ioctl) from [<c02541a0>] (do_vfs_ioctl+0x9c/0x8e4) [<c02541a0>] (do_vfs_ioctl) from [<c0254a1c>] (SyS_ioctl+0x34/0x5c) [<c0254a1c>] (SyS_ioctl) from [<c0107640>] (ret_fast_syscall+0x0/0x3c) Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 14:16:07 +02:00
Dmitry Osipenko	d6c153ec85	drm/tegra: Correct idr_alloc() minimum id The client ID 0 is reserved by the host1x/cdma to mark the timeout timer work as already been scheduled and context ID is used as the clients one. This fixes spurious CDMA timeouts. Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/9c19a44219acd988e678cf9abe21363911184625.1497480754.git.digetx@gmail.com	2017-06-15 14:12:25 +02:00
Dmitry Osipenko	1066a8959d	drm/tegra: Fix lockup on a use of staging API Commit bdd2f9cd10eb ("Don't leak kernel pointer to userspace") added a mutex around staging IOCTL's, some of those mutexes are taken twice. Fixes: bdd2f9cd10eb ("drm/tegra: Don't leak kernel pointer to userspace") Signed-off-by: Dmitry Osipenko <digetx@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Reviewed-by: Erik Faye-Lund <kusmabite@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/7b70a506a9d2355ea6ff19a8c4f4d726b67719b3.1497480754.git.digetx@gmail.com	2017-06-15 14:11:05 +02:00
Christophe JAILLET	59e04bc20d	gpu: host1x: Fix error handling If 'devm_reset_control_get' returns an error, then we erroneously return success because error code is taken from 'host->clk' instead of 'host->rst'. Fixes: b386c6b73ac6 ("gpu: host1x: Support module reset") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170410202922.17665-1-christophe.jaillet@wanadoo.fr	2017-06-15 14:06:49 +02:00
Thierry Reding	466749f13e	gpu: host1x: Flesh out kerneldoc Improve kerneldoc for the public parts of the host1x infrastructure in preparation for adding driver-specific part to the GPU documentation. Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Signed-off-by: Thierry Reding <treding@nvidia.com>	2017-06-15 13:58:43 +02:00
Chris Wilson	8c45cec48e	drm/i915: Split vma exec_link/evict_link Currently the vma has one link member that is used for both holding its place in the execbuf reservation list, and in any eviction list. This dual property is quite tricky and error prone. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Mika Kuoppala <mika.kuoppala@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-3-chris@chris-wilson.co.uk	2017-06-15 10:53:26 +01:00
Chris Wilson	d55495b4dc	drm/i915: Use vma->exec_entry as our double-entry placeholder This has the benefit of not requiring us to manipulate the vma->exec_link list when tearing down the execbuffer, and is a marginally cheaper test to detect the user error. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-2-chris@chris-wilson.co.uk	2017-06-15 10:52:58 +01:00
Chris Wilson	650bc63568	drm/i915: Amalgamate execbuffer parameter structures Combine the two slightly overlapping parameter structures we pass around the execbuffer routines into one. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170615081435.17699-1-chris@chris-wilson.co.uk	2017-06-15 10:50:35 +01:00
Lionel Landwerlin	28c7ef9ecc	drm/i915/perf: add GLK support Add OA support for Geminilake (pretty much identical to Broxton), and also add the associated OA configurations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net> Link: http://patchwork.freedesktop.org/patch/msgid/20170613112309.4088-2-lionel.g.landwerlin@intel.com	2017-06-14 12:31:58 -07:00
Lionel Landwerlin	6c5c1d89af	drm/i915/perf: add KBL support Add OA support for Kabylake (pretty much identical to Skylake), and also add the associated OA configurations. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:58 -07:00
Lionel Landwerlin	3891589eee	drm/i915: add KBL GT2/GT3 check macros Add macros to detect GT2/GT3 skus so we can apply the proper OA configuration later. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	1bef3409f1	drm/i915/perf: remove perf.hook_lock In earlier iterations of the i915-perf driver we had a number of callbacks/hooks from other parts of the i915 driver to e.g. notify us when a legacy context was pinned and these could run asynchronously with respect to the stream file operations and might also run in atomic context. dev_priv->perf.hook_lock had been for serialising access to state needed within these callbacks, but as the code has evolved some of the hooks have gone away or are implemented to avoid needing to lock any state. The remaining use of this lock was actually redundant considering how the gen7 oacontrol state used to be updated as part of a context pin hook. Signed-off-by: Robert Bragg <robert@sixbynine.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	155e941f49	drm/i915/perf: per-gen timebase for checking sample freq An oa_exponent_to_ns() utility and per-gen timebase constants were recently removed when updating the tail pointer race condition WA, and this restores those so we can update the _PROP_OA_EXPONENT validation done in read_properties_unlocked() to not assume we have a 12.5MHz timebase as we did for Haswell. Accordingly the oa_sample_rate_hard_limit value that's referenced by proc_dointvec_minmax defining the absolute limit for the OA sampling frequency is now initialized to (timestamp_frequency / 2) instead of the 6.25MHz constant for Haswell. v2: Specify frequency of 19.2MHz for BXT (Ville) Initialize oa_sample_rate_hard_limit per-gen too (Lionel) Signed-off-by: Robert Bragg <robert@sixbynine.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	fc59921178	drm/i915/perf: Add more OA configs for BDW, CHV, SKL + BXT These are auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml Signed-off-by: Robert Bragg <robert@sixbynine.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	19f81df285	drm/i915/perf: Add OA unit support for Gen 8+ Enables access to OA unit metrics for BDW, CHV, SKL and BXT which all share (more-or-less) the same OA unit design. Of particular note in comparison to Haswell: some OA unit HW config state has become per-context state and as a consequence it is somewhat more complicated to manage synchronous state changes from the cpu while there's no guarantee of what context (if any) is currently actively running on the gpu. The periodic sampling frequency which can be particularly useful for system-wide analysis (as opposed to command stream synchronised MI_REPORT_PERF_COUNT commands) is perhaps the most surprising state to have become per-context save and restored (while the OABUFFER destination is still a shared, system-wide resource). This support for gen8+ takes care to consider a number of timing challenges involved in synchronously updating per-context state primarily by programming all config state from the cpu and updating all current and saved contexts synchronously while the OA unit is still disabled. The driver intentionally avoids depending on command streamer programming to update OA state considering the lack of synchronization between the automatic loading of OACTXCONTROL state (that includes the periodic sampling state and enable state) on context restore and the parsing of any general purpose BB the driver can control. I.e. this implementation is careful to avoid the possibility of a context restore temporarily enabling any out-of-date periodic sampling state. In addition to the risk of transiently-out-of-date state being loaded automatically; there are also internal HW latencies involved in the loading of MUX configurations which would be difficult to account for from the command streamer (and we only want to enable the unit when once the MUX configuration is complete). 
Since the Gen8+ OA unit design no longer supports clock gating the unit off for a single given context (which effectively stopped any progress of counters while any other context was running) and instead supports tagging OA reports with a context ID for filtering on the CPU, it means we can no longer hide the system-wide progress of counters from a non-privileged application only interested in metrics for its own context. Although we could theoretically try and subtract the progress of other contexts before forwarding reports via read() we aren't in a position to filter reports captured via MI_REPORT_PERF_COUNT commands. As a result, for Gen8+, we always require the dev.i915.perf_stream_paranoid to be unset for any access to OA metrics if not root. v5: Drain submitted requests when enabling metric set to ensure no lite-restore erases the context image we just updated (Lionel) v6: In addition to drain, switch to kernel context & update all context in place (Chris) v7: Add missing mutex_unlock() if switching to kernel context fails (Matthew) v8: Simplify OA period/flex-eu-counters programming by using the batchbuffer instead of modifying ctx-image (Lionel) v9: Back to updating the context image (due to erroneous testing, batchbuffer programming the OA unit doesn't actually work) (Lionel) Pin context before updating context image (Chris) Drop MMIO programming now that we switch to a kernel context with right values in initial context image (Chris) v10: Just pin_map the contexts we want to modify or let the configuration happen on first use (Chris) v11: Update kernel context OA config through the batchbuffer rather than on the fly ctx-image update (Lionel) v12: Rework OA context registers update again by switching away from user contexts and reconfiguring the kernel context through the batchbuffer and updating all the other contexts' context image. Also take care to lock slice/subslice configuration when OA is on.
(Lionel) v13: Request rpcs updates on all engine when updating the OA config (Lionel) v14: Drop any kind of rpcs management now that we monitor sseu configuration changes in a later patch (Lionel) Remove usleep after programming the NOA configs on Gen8+, this doesn't seem to be needed (Lionel) v15: Respect coding style for block comments (Chris) v16: Add missing i915_add_request() in case we fail to emit OA configuration (Matthew) Signed-off-by: Robert Bragg <robert@sixbynine.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> \o/ Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	5182f646c7	drm/i915/perf: Add 'render basic' Gen8+ OA unit configs Adds a static OA unit, MUX, B Counter + Flex EU configurations for basic render metrics on Broadwell, Cherryview, Skylake and Broxton. These are auto generated from an XML description of metric sets, currently maintained in gputop, ref: https://github.com/rib/gputop > gputop-data/oa-*.xml > scripts/i915-perf-kernelgen.py $ make -C gputop-data -f Makefile.xml WHITELIST=RenderBasic v2: add newlines to debug messages + fix comment (Matthew Auld) Signed-off-by: Robert Bragg <robert@sixbynine.org> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Lionel Landwerlin	3f488d9985	drm/i915/perf: rework mux configurations queries Gen8+ might have mux configurations per slices/subslices. Depending on whether slices/subslices have been fused off, only part of the configuration needs to be applied. This change reworks the mux configurations query mechanism to allow more than one set of registers to be programmed. v2: s/n_mux_regs/n_mux_configs/ (Matthew) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	f532023381	drm/i915: expose _SUBSLICE_MASK GETPARM Assuming a uniform mask across all slices, this enables userspace to determine the specific sub slices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW sub slice configuration. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Robert Bragg	7fed555c02	drm/i915: expose _SLICE_MASK GETPARM Enables userspace to determine the maximum number of slices that can be enabled on the device and also know what specific slices can be enabled. This information is required, for example, to be able to analyse some OA counter reports where the counter configuration depends on the HW slice configuration. Signed-off-by: Robert Bragg <robert@sixbynine.org> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>	2017-06-14 12:31:57 -07:00
Hoegeun Kwon	e2af12bfb0	drm/panel: s6e3ha2: Add support for s6e3hf2 panel on TM2e board This patch supports TM2e panel and the panel has 1600x2560 resolution in 5.65" physical. This identify panel type with compatibility string, also invoke display mode that matches the type. So add the check code for s6e3ha2 compatibility and s6e3hf2 type and select the drm_display_mode of default and edge type. Signed-off-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Reviewed-by: Andrzej Hajda <a.hajda@samsung.com> Reviewed-by: Inki Dae <inki.dae@samsung.com> [treding@nvidia.com: fixup checkpatch warnings] Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/1492504836-19225-3-git-send-email-hoegeun.kwon@samsung.com	2017-06-14 20:18:22 +02:00
Arnd Bergmann	2a49816821	drm/panel: add backlight dependency for sitronix-st7789v Without the dependency, we run into a link error: drivers/gpu/drm/panel/panel-sitronix-st7789v.o: In function `st7789v_probe': panel-sitronix-st7789v.c:(.text.st7789v_probe+0xc0): undefined reference to `of_find_backlight_by_node' Fixes: 7142afb3a186 ("drm/panel: Add driver for sitronix ST7789V LCD controller") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170419180326.303994-1-arnd@arndb.de	2017-06-14 20:06:53 +02:00
Arnd Bergmann	93023c1404	drm/panel: S6E3HA2 needs backlight code The new S6E3HA2 driver fails to link when backlight is disabled: ERROR: "backlight_device_register" [drivers/gpu/drm/panel/panel-samsung-s6e3ha2.ko] undefined! ERROR: "backlight_device_unregister" [drivers/gpu/drm/panel/panel-samsung-s6e3ha2.ko] undefined! This adds a Kconfig dependency like we have it for some other panel drivers. Fixes: ed29f9426d9b ("drm/panel: Add support for S6E3HA2 panel driver on TM2 board") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Hoegeun Kwon <hoegeun.kwon@samsung.com> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170419175939.189098-2-arnd@arndb.de	2017-06-14 20:06:27 +02:00
Lucas Stach	70c0d5b783	drm/panel: simple: add support for AUO P320HVN03 This adds support for the AU Optronics Corporation 31.5" FHD (1920x1080) LVDS TFT LCD panel, which can be supported by the simple panel driver Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170608180758.31020-4-l.stach@pengutronix.de	2017-06-14 19:37:56 +02:00
Lucas Stach	4177fa66a3	drm/panel: simple: add support for NLT NL192108AC18-02D This adds support for the NLT Technologies NL192108AC18-02D 15.6" LVDS FullHD TFT LCD panel, which can be supported by the simple panel driver. Timings are taken from the preliminary datasheet, as a final one is not yet available. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Acked-by: Rob Herring <robh@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170608180758.31020-3-l.stach@pengutronix.de	2017-06-14 19:37:27 +02:00
Lucas Stach	01bacc13a3	drm/panel: simple: add support for NEC NL12880B20-05 This adds support for the NEC LCD Technologies, Ltd. 12.1" WXGA (1280x800) LVDS TFT LCD panel, which can be supported by the simple panel driver. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170608180758.31020-1-l.stach@pengutronix.de	2017-06-14 19:36:29 +02:00
Chris Zhong	14c8f2e9f8	drm/panel: add Innolux P079ZCA panel driver Support Innolux P079ZCA 7.85" 768x1024 TFT LCD panel, it is a MIPI DSI panel. Signed-off-by: Chris Zhong <zyw@rock-chips.com> Reviewed-by: Sean Paul <seanpaul@chromium.org> Tested-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Thierry Reding <treding@nvidia.com> Link: http://patchwork.freedesktop.org/patch/msgid/1490316692-20506-2-git-send-email-zyw@rock-chips.com	2017-06-14 19:31:45 +02:00
Mario Kleiner	55f61a040e	drm/radeon: Fix overflow of watermark calcs at > 4k resolutions. Commit e6b9a6c84b93 ("drm/radeon: Make display watermark calculations more accurate") made watermark calculations more accurate, but not for > 4k resolutions on 32-Bit architectures, as it introduced an integer overflow for those setups and resolutions. Fix this by proper u64 casting and division. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Fixes: e6b9a6c84b93 ("drm/radeon: Make display watermark calculations more accurate") Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-06-14 09:25:58 -04:00
Mario Kleiner	bea1041393	drm/amdgpu: Fix overflow of watermark calcs at > 4k resolutions. Commit d63c277dc672e0 ("drm/amdgpu: Make display watermark calculations more accurate") made watermark calculations more accurate, but not for > 4k resolutions on 32-Bit architectures, as it introduced an integer overflow for those setups and resolutions. Fix this by proper u64 casting and division. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Reported-by: Ben Hutchings <ben.hutchings@codethink.co.uk> Fixes: d63c277dc672 ("drm/amdgpu: Make display watermark calculations more accurate") Cc: Ben Hutchings <ben.hutchings@codethink.co.uk> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-06-14 09:25:05 -04:00
Christian König	c0f83da96b	drm/radeon: fix "force the UVD DPB into VRAM as well" The DPB must be in VRAM, but not in the first segment. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Arthur Marsh <arthur.marsh@internode.on.net> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2017-06-14 09:23:42 -04:00
Chris Wilson	9ee82d783e	drm/i915: Reinstate reservation_object zapping for batch_pool objects I removed the zapping of the reservation_object->fence array of shared fences prematurely. We don't yet have the code to zap that array when retiring the object, and so currently it remains possible to continually grow the shared array trapping requests when reusing the batch_pool object across many timelines. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Mika Kuoppala <mika.kuoppala@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170518094638.5469-4-chris@chris-wilson.co.uk	2017-06-14 14:06:22 +01:00
Chris Wilson	290271de34	drm/i915: Spin for struct_mutex inside shrinker Having resolved whether or not we would deadlock upon a call to mutex_lock(&dev->struct_mutex), we can then spin for the contended struct_mutex if we are not the owner. We cannot afford to simply block and wait for the mutex, as the owner may itself be waiting for the allocator -- i.e. a cyclic deadlock. This should significantly improve the chance of running the shrinker for other processes whilst the GPU is busy. A more balanced approach would be to optimistically spin whilst the mutex owner was on the cpu and there was an opportunity to acquire the mutex for ourselves quickly. However, that requires support from kernel/locking/ and a new mutex_spin_trylock() primitive. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-4-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-06-14 10:55:11 +01:00
Chris Wilson	0f6ab55d7a	drm/i915: Only restrict noreclaim in the early shrink passes In our first pass, we do not want to use reclaim at all as we want to solely reap the i915 buffer caches (its purgeable pages). But we don't mind it initiates IO or pulls via the FS (but it shouldn't anyway as we say no to reclaim!). Just drop the GFP_IO constraint for simplicity. Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Mika Kuoppala <mika.kuoppala@linux.intel.com> Link: http://patchwork.freedesktop.org/patch/msgid/20170609110350.1767-3-chris@chris-wilson.co.uk Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2017-06-14 10:53:37 +01:00
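
Illustration for commit 368f622c0d (drm/tegra: Check for malformed offsets and sizes in the 'submit' IOCTL): the message lists three checks on a command buffer, namely 4-byte alignment of its address, a maximum of 16383 gather fetches, and a word count that must fit inside the BO. The following is only a minimal sketch of such checks, not the code from the patch. The function name gather_is_valid(), its parameters, and the GATHER_MAX_WORDS macro are hypothetical, and treating the 16383 limit as a word count is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limit quoted in the commit message; the macro name is hypothetical. */
#define GATHER_MAX_WORDS 16383u

/*
 * Hypothetical helper, not the driver's actual function: returns true if
 * a gather of `words` 32-bit words starting at byte `offset` lies entirely
 * inside a buffer object of `bo_size` bytes.
 */
static bool gather_is_valid(uint64_t bo_size, uint64_t offset, uint32_t words)
{
	/* Widen before multiplying so the byte count cannot wrap. */
	uint64_t bytes = (uint64_t)words * sizeof(uint32_t);

	/* The command buffer address must be 4-byte aligned. */
	if (offset & 3)
		return false;

	/* More words than the assumed maximum gather size. */
	if (words > GATHER_MAX_WORDS)
		return false;

	/* Compare against the remaining space to avoid offset + bytes overflow. */
	if (offset > bo_size || bytes > bo_size - offset)
		return false;

	return true;
}
```

Comparing the byte count against bo_size - offset, rather than computing offset + bytes, keeps the bounds check itself free of integer overflow.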

36998 Commits