104 Commits

Author SHA1 Message Date
Matt Roper
0d6419e9c8 drm/i915: Move GT registers to their own header file
This is a huge, chaotic mass of registers copied over as-is without any
real cleanup.  We'll come back and organize these better, align on
consistent coding style, remove dead code, etc. in separate patches
later that will be easier to review.

v2:
 - Add missing include in intel_pxp_irq.c
v3:
 - Correct a few indentation errors (Lucas)
 - Minor conflict resolution

Cc: Jani Nikula <jani.nikula@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220127234334.4016964-6-matthew.d.roper@intel.com
2022-02-02 07:59:14 -08:00
Rodrigo Vivi
063565aca3 Merge drm/drm-next into drm-intel-next
Catch-up with 5.17-rc2 and trying to align with drm-intel-gt-next
for a possible topic branch for merging the split of i915_regs...

Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
2022-01-31 13:19:33 -05:00
Matt Roper
202b1f4c12 drm/i915/gt: Move engine registers to their own header
Let's continue breaking up and cleaning up the massive i915_reg.h file
by moving all registers that are defined in relation to an engine base
to their own header.

There are probably a bunch of other "engine registers" that we haven't
moved yet (especially those that belong to the render engine in the
0x2??? range), but this is a relatively straightforward first step.

Cc: Jani Nikula <jani.nikula@linux.intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220111051600.3429104-8-matthew.d.roper@intel.com
2022-01-11 14:03:25 -08:00
Michał Winiarski
2cbc876daa drm/i915: Use to_gt() helper
Use to_gt() helper consistently throughout the codebase.
Pure mechanical s/i915->gt/to_gt(i915). No functional changes.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Signed-off-by: Andi Shyti <andi.shyti@linux.intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Signed-off-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20211214193346.21231-10-andi.shyti@linux.intel.com
2021-12-17 21:51:59 -08:00
Vinay Belgaumkar
41e5c17ebf drm/i915/guc/slpc: Sysfs hooks for SLPC
Update the get/set min/max freq hooks to work for
SLPC case as well. Consolidate helpers for requested/min/max
frequency get/set to intel_rps where the proper action can
be taken depending on whether SLPC is enabled.

v2: Add wrappers for getting rp0/1/n frequencies, update
softlimits in set min/max SLPC functions. Also check for
boundary conditions before setting them.

v3: Address review comments (Michal W)

v4: Add helper for host part of intel_rps_set_freq helpers (Michal W)

v5: checkpatch()

Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Acked-by: Michal Wajdeczko <michal.wajdeczko@intel.com>
Signed-off-by: Vinay Belgaumkar <vinay.belgaumkar@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Signed-off-by: Sujaritha Sundaresan <sujaritha.sundaresan@intel.com>
Signed-off-by: John Harrison <John.C.Harrison@Intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210730202119.23810-13-vinay.belgaumkar@intel.com
2021-08-03 16:05:40 -07:00
Jason Ekstrand
a04ea6ae7c drm/i915: Use a table for i915_init/exit (v2)
If the driver was not fully loaded, we may still have globals lying
around.  If we don't tear those down in i915_exit(), we'll leak a bunch
of memory slabs.  This can happen two ways: use_kms = false and if we've
run mock selftests.  In either case, we have an early exit from
i915_init which happens after i915_globals_init() and we need to clean
up those globals.

The mock selftests case is especially sticky.  The load isn't entirely
a no-op.  We actually do quite a bit inside those selftests including
allocating a bunch of mock objects and running tests on them.  Once all
those tests are complete, we exit early from i915_init().  Perviously,
i915_init() would return a non-zero error code on failure and a zero
error code on success.  In the success case, we would get to i915_exit()
and check i915_pci_driver.driver.owner to detect if i915_init exited early
and do nothing.  In the failure case, we would fail i915_init() but
there would be no opportunity to clean up globals.

The most annoying part is that you don't actually notice the failure as
part of the self-tests since leaking a bit of memory, while bad, doesn't
result in anything observable from userspace.  Instead, the next time we
load the driver (usually for next IGT test), i915_globals_init() gets
invoked again, we go to allocate a bunch of new memory slabs, those
implicitly create debugfs entries, and debugfs warns that we're trying
to create directories and files that already exist.  Since this all
happens as part of the next driver load, it shows up in the dmesg-warn
of whatever IGT test ran after the mock selftests.

While the obvious thing to do here might be to call i915_globals_exit()
after selftests, that's not actually safe.  The dma-buf selftests call
i915_gem_prime_export which creates a file.  We call dma_buf_put() on
the resulting dmabuf which calls fput() on the file.  However, fput()
isn't immediate and gets flushed right before syscall returns.  This
means that all the fput()s from the selftests don't happen until right
before the module load syscall used to fire off the selftests returns
which is after i915_init().  If we call i915_globals_exit() in
i915_init() after selftests, we end up freeing slabs out from under
objects which won't get released until fput() is flushed at the end of
the module load syscall.

The solution here is to let i915_init() return success early and detect
the early success in i915_exit() and only tear down globals and nothing
else.  This way the module loads successfully, regardless of the success
or failure of the tests.  Because we've not enumerated any PCI devices,
no device nodes are created and it's entirely useless from userspace.
The only thing the module does at that point is hold on to a bit of
memory until we unload it and i915_exit() is called.  Importantly, this
means that everything from our selftests has the ability to properly
flush out between i915_init() and i915_exit() because there is at least
one syscall boundary in between.

In order to handle all the delicate init/exit cases, we convert the
whole thing to a table of init/exit pairs and track the init status in
the new init_progress global.  This allows us to ensure that i915_exit()
always tears down exactly the things that i915_init() successfully
initialized.  We also allow early-exit of i915_init() without failure by
an init function returning > 0.  This is useful for nomodeset, and
selftests.  For the mock selftests, we convert them to always return 1
so we get the desired behavior of the driver always succeeding to load
the driver and then properly tearing down the partially loaded driver.

v2 (Tvrtko Ursulin):
 - Guard init_funcs[i].exit with GEM_BUG_ON(i >= ARRAY_SIZE(init_funcs))
v2 (Daniel Vetter):
 - Update the docstring for i915.mock_selftests

Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20210721152358.2893314-4-jason@jlekstrand.net
2021-07-22 12:05:17 +02:00
Dave Airlie
2a7005c8a3 Merge tag 'drm-intel-gt-next-2021-06-10' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:

- Disable mmap ioctl for gen12+ (excl. TGL-LP)
- Start enabling HuC loading by default for upcoming Gen12+
  platforms (excludes TGL and RKL)

Core Changes:

- Backmerge of drm-next

Driver Changes:

- Revert "i915: use io_mapping_map_user" (Eero, Matt A)
- Initialize the TTM device and memory managers (Thomas)
- Major rework to the GuC submission backend to prepare
  for enabling on new platforms (Michal Wa., Daniele,
  Matt B, Rodrigo)
- Fix i915_sg_page_sizes to record dma segments rather
  than physical pages (Thomas)

- Locking rework to prep for TTM conversion (Thomas)
- Replace IS_GEN and friends with GRAPHICS_VER (Lucas)
- Use DEVICE_ATTR_RO macro (Yue)
- Static code checker fixes (Zhihao)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMHeDxg9VLiFtyn3@jlahtine-mobl.ger.corp.intel.com
2021-06-11 13:37:34 +10:00
Dave Airlie
a2098e857b Cross-subsystem Changes:
-  x86/gpu: add JasperLake to gen11 early quirks
   (Although the patch lacks the Ack info, it has been Acked by Borislav)
 
 Driver Changes:
 
 - General DMC improves (Anusha)
 - More ADL-P enabling (Vandita, Matt, Jose, Mika, Anusha, Imre, Lucas, Jani, Manasi, Ville, Stanislav)
 - Introduce MBUS relative dbuf offset (Ville)
 - PSR fixes and improvements (Gwan, Jose, Ville)
 - Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4 (Ville)
 - Remove duplicated declarations (Shaokun, Wan)
 - Check HDMI sink deep color capabilities during .mode_valid (Ville)
 - Fix display flicker screan related to console and FBC (Chris)
 - Remaining conversions of GRAPHICS_VER (Lucas)
 - Drop invalid FIXME (Jose)
 - Fix bigjoiner check in dsc_disable (Vandita)
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEbSBwaO7dZQkcLOKj+mJfZA7rE8oFAmDBMp8ACgkQ+mJfZA7r
 E8rkngf/cq6JI3nLmQpNSoDJ1VosuuLgVKGMeL+NR4UmHqsjzaxTL7evaJzf38mS
 wDaTvB3eEUKAFuvIY/US6xO3gPXb1TtmJ4UBizzkK7DOeh53LXvrxX+ifdg6RXx9
 7WsNvnUMItGX5+CRtHeWqmqptBCXTup1ntjAvTOKc9S20gshDHX0/eyk04Ub5FOb
 cVgt9FoDhTVY6Z2wWG9G0pezbuWc3rDMei+cboXUXCx+QEjjdYNyrb32UT6e1Qfm
 oBWRhOMTe+aJtbGen+l134I1uS3XCfjZ8zHVqLXMUhCJ443yB0LEhPdk56PJSD9F
 MoKujBlyxF1dM7SDQ/h6+7uhpvOkvA==
 =0nIT
 -----END PGP SIGNATURE-----

Merge tag 'drm-intel-next-2021-06-09' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

Cross-subsystem Changes:

-  x86/gpu: add JasperLake to gen11 early quirks
  (Although the patch lacks the Ack info, it has been Acked by Borislav)

Driver Changes:

- General DMC improves (Anusha)
- More ADL-P enabling (Vandita, Matt, Jose, Mika, Anusha, Imre, Lucas, Jani, Manasi, Ville, Stanislav)
- Introduce MBUS relative dbuf offset (Ville)
- PSR fixes and improvements (Gwan, Jose, Ville)
- Re-enable LTTPR non-transparent LT mode for DPCD_REV < 1.4 (Ville)
- Remove duplicated declarations (Shaokun, Wan)
- Check HDMI sink deep color capabilities during .mode_valid (Ville)
- Fix display flicker screan related to console and FBC (Chris)
- Remaining conversions of GRAPHICS_VER (Lucas)
- Drop invalid FIXME (Jose)
- Fix bigjoiner check in dsc_disable (Vandita)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YMEy2Ew82BeL/hDK@intel.com
2021-06-10 13:45:11 +10:00
Lucas De Marchi
651e7d4857 drm/i915: replace IS_GEN and friends with GRAPHICS_VER
This was done by the following semantic patch:

	@@ expression i915; @@
	- INTEL_GEN(i915)
	+ GRAPHICS_VER(i915)

	@@ expression i915; expression E; @@
	- INTEL_GEN(i915) >= E
	+ GRAPHICS_VER(i915) >= E

	@@ expression dev_priv; expression E; @@
	- !IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) != E

	@@ expression dev_priv; expression E; @@
	- IS_GEN(dev_priv, E)
	+ GRAPHICS_VER(dev_priv) == E

	@@
	expression dev_priv;
	expression from, until;
	@@
	- IS_GEN_RANGE(dev_priv, from, until)
	+ IS_GRAPHICS_VER(dev_priv, from, until)

	@def@
	expression E;
	identifier id =~ "^gen$";
	@@
	- id = GRAPHICS_VER(E)
	+ ver = GRAPHICS_VER(E)

	@@
	identifier def.id;
	@@
	- id
	+ ver

It also takes care of renaming the variable we assign to GRAPHICS_VER()
so to use "ver" rather than "gen".

Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>
Reviewed-by: Matt Roper <matthew.d.roper@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210606045050.103862-2-lucas.demarchi@intel.com
2021-06-07 00:59:48 -07:00
YueHaibing
177f30c6c1 drm/i915: use DEVICE_ATTR_RO macro
Use DEVICE_ATTR_RO() helper instead of plain DEVICE_ATTR(),
which makes the code a bit shorter and easier to read.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210528100403.21548-1-yuehaibing@huawei.com
2021-06-02 10:21:16 +01:00
Tvrtko Ursulin
399cd97970 drm/i915/pmu: Check actual RC6 status
RC6 support cannot be simply established by looking at the static device
HAS_RC6() flag. There are cases which disable RC6 at driver load time so
use the status of those check when deciding whether to enumerate the rc6
counter.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Eero T Tamminen <eero.t.tamminen@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210331101850.2582027-1-tvrtko.ursulin@linux.intel.com
2021-04-09 14:49:15 +01:00
Thomas Zimmermann
8ff5446a7c drm/i915: Remove references to struct drm_device.pdev
Using struct drm_device.pdev is deprecated. Convert i915 to struct
drm_device.dev. No functional changes.

v6:
	* also remove assignment in selftests/ in a later patch (Chris)
v5:
	* remove assignment in later patch (Chris)
v3:
	* rebased
v2:
	* move gt/ and gvt/ changes into separate patches

Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Jani Nikula <jani.nikula@linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210128133127.2311-2-tzimmermann@suse.de
2021-02-02 13:58:42 +02:00
Dave Airlie
fb5cfcaa2e Merge tag 'drm-intel-gt-next-2021-01-14' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
UAPI Changes:
- Deprecate I915_PMU_LAST and optimize state tracking (Tvrtko)

  Avoid relying on last item ABI marker in i915_drm.h, add a
  comment to mark as deprecated.

Cross-subsystem Changes:

Core Changes:

Driver Changes:

- Restore clear residuals security mitigations for Ivybridge and
  Baytrail (Chris)
- Close #1858: Allow sysadmin to choose applied GPU security mitigations
  through i915.mitigations=... similar to CPU (Chris)
- Fix for #2024: GPU hangs on HSW GT1 (Chris)
- Fix for #2707: Driver hang when editing UVs in Blender (Chris, Ville)
- Fix for #2797: False positive GuC loading error message (Chris)
- Fix for #2859: Missing GuC firmware for older Cometlakes (Chris)
- Lessen probability of GPU hang due to DMAR faults [reason 7,
  next page table ptr is invalid] on Tigerlake (Chris)
- Fix REVID macros for TGL to fetch correct stepping (Aditya)
- Limit frequency drop to RPe on parking (Chris, Edward)
- Limit W/A 1406941453 to TGL, RKL and DG1 (Swathi)
- Make W/A 22010271021 permanent on DG1 (Lucas)
- Implement W/A 16011163337 to prevent a HS/DS hang on DG1 (Swathi)
- Only disable preemption on gen8 render engines (Chris)
- Disable arbitration around Braswell's PDP updates (Chris)
- Disable arbitration on no-preempt requests (Chris)
- Check for arbitration after writing start seqno before busywaiting (Chris)
- Retain default context state across shrinking (Venkata, CQ)
- Fix mismatch between misplaced vma check and vma insert for 32-bit
  addressing userspaces (Chris, CQ)
- Propagate error for vmap() failure instead kernel NULL deref (Chris)
- Propagate error from cancelled submit due to context closure
  immediately (Chris)
- Fix RCU race on HWSP tracking per request (Chris)
- Clear CMD parser shadow and GPU reloc batches (Matt A)

- Populate logical context during first pin (Maarten)
- Optimistically prune dma-resv from the shrinker (Chris)
- Fix for virtual engine ownership race (Chris)
- Remove timeslice suppression to restore fairness for virtual engines (Chris)
- Rearrange IVB/HSW workarounds properly between GT and engine (Chris)
- Taint the reset mutex with the shrinker (Chris)
- Replace direct submit with direct call to tasklet (Chris)
- Multiple corrections to virtual engine dequeue and breadcrumbs code (Chris)
- Avoid wakeref from potentially hard IRQ context in PMU (Tvrtko)
- Use raw clock for RC6 time estimation in PMU (Tvrtko)
- Differentiate OOM failures from invalid map types (Chris)
- Fix Gen9 to have 64 MOCS entries similar to Gen11 (Chris)
- Ignore repeated attempts to suspend request flow across reset (Chris)
- Remove livelock from "do_idle_maps" VT-d W/A (Chris)
- Cancel the preemption timeout early in case engine reset fails (Chris)
- Code flow optimization in the scheduling code (Chris)
- Clear the execlists timers upon reset (Chris)
- Drain the breadcrumbs just once (Chris, Matt A)
- Track the overall GT awake/busy time (Chris)
- Tweak submission tasklet flushing to avoid starvation (Chris)
- Track timelines created using the HWSP to restore on resume (Chris)
- Use cmpxchg64 for 32b compatilibity for active tracking (Chris)
- Prefer recycling an idle GGTT fence to avoid GPU wait (Chris)

- Restructure GT code organization for clearer split between GuC
  and execlists (Chris, Daniele, John, Matt A)
- Remove GuC code that will remain unused by new interfaces (Matt B)
- Restructure the CS timestamp clocks code to local to GT (Chris)
- Fix error return paths in perf code (Zhang)
- Replace idr_init() by idr_init_base() in perf (Deepak)
- Fix shmem_pin_map error path (Colin)
- Drop redundant free_work worker for GEM contexts (Chris, Mika)
- Increase readability and understandability of intel_workarounds.c (Lucas)
- Defer enabling the breadcrumb interrupt to after submission (Chris)
- Deal with buddy alloc block sizes beyond 4G (Venkata, Chris)
- Encode fence specific waitqueue behaviour into the wait.flags (Chris)
- Don't cancel the breadcrumb interrupt shadow too early (Chris)
- Cancel submitted requests upon context reset (Chris)
- Use correct locks in GuC code (Tvrtko)
- Prevent use of engine->wa_ctx after error (Chris, Matt R)

- Fix build warning on 32-bit (Arnd)
- Avoid memory leak if platform would have more than 16 W/A (Tvrtko)
- Avoid unnecessary #if CONFIG_PM in PMU code (Chris, Tvrtko)
- Improve debugging output (Chris, Tvrtko, Matt R)
- Make file local variables static (Jani)
- Avoid uint*_t types in i915 (Jani)
- Selftest improvements (Chris, Matt A, Dan)
- Documentation fixes (Chris, Jose)

Signed-off-by: Dave Airlie <airlied@redhat.com>

# Conflicts:
#	drivers/gpu/drm/i915/gt/intel_breadcrumbs.c
#	drivers/gpu/drm/i915/gt/intel_breadcrumbs_types.h
#	drivers/gpu/drm/i915/gt/intel_lrc.c
#	drivers/gpu/drm/i915/gvt/mmio_context.h
#	drivers/gpu/drm/i915/i915_drv.h
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20210114152232.GA21588@jlahtine-mobl.ger.corp.intel.com
2021-01-15 15:03:36 +10:00
Linus Torvalds
3913d00ac5 A treewide cleanup of interrupt descriptor (ab)use with all sorts of racy
accesses, inefficient and disfunctional code. The goal is to remove the
 export of irq_to_desc() to prevent these things from creeping up again.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/ifgsTHHRnbHhAbGlu
 dXRyb25peC5kZQAKCRCmGPVMDXSYoYm6EACAo8sObkuY3oWLagtGj1KHxon53oGZ
 VfDw2LYKM+rgJjDWdiyocyxQU5gtm6loWCrIHjH2adRQ4EisB5r8hfI8NZHxNMyq
 8khUi822NRBfFN6SCpO8eW9o95euscNQwCzqi7gV9/U/BAKoDoSEYzS4y0YmJlup
 mhoikkrFiBuFXplWI0gbP4ihb8S/to2+kTL6o7eBoJY9+fSXIFR3erZ6f3fLjYZG
 CQUUysTywdDhLeDkC9vaesXwgdl2XnaPRwcQqmK8Ez0QYNYpawyILUHLD75cIHDu
 bHdK2ZoDv/wtad/3BoGTK3+wChz20a/4/IAnBIUVgmnSLsPtW8zNEOPWNNc0aGg+
 rtafi5bvJ1lMoSZhkjLWQDOGU6vFaXl9NkC2fpF+dg1skFMT2CyLC8LD/ekmocon
 zHAPBva9j3m2A80hI3dUH9azo/IOl1GHG8ccM6SCxY3S/9vWSQChNhQDLe25xBEO
 VtKZS7DYFCRiL8mIy9GgwZWof8Vy2iMua2ML+W9a3mC9u3CqSLbCFmLMT/dDoXl1
 oHnMdAHk1DRatA8pJAz83C75RxbAS2riGEqtqLEQ6OaNXn6h0oXCanJX9jdKYDBh
 z6ijWayPSRMVktN6FDINsVNFe95N4GwYcGPfagIMqyMMhmJDic6apEzEo7iA76lk
 cko28MDqTIK4UQ==
 =BXv+
 -----END PGP SIGNATURE-----

Merge tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull irq updates from Thomas Gleixner:
 "This is the second attempt after the first one failed miserably and
  got zapped to unblock the rest of the interrupt related patches.

  A treewide cleanup of interrupt descriptor (ab)use with all sorts of
  racy accesses, inefficient and disfunctional code. The goal is to
  remove the export of irq_to_desc() to prevent these things from
  creeping up again"

* tag 'irq-core-2020-12-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits)
  genirq: Restrict export of irq_to_desc()
  xen/events: Implement irq distribution
  xen/events: Reduce irq_info:: Spurious_cnt storage size
  xen/events: Only force affinity mask for percpu interrupts
  xen/events: Use immediate affinity setting
  xen/events: Remove disfunct affinity spreading
  xen/events: Remove unused bind_evtchn_to_irq_lateeoi()
  net/mlx5: Use effective interrupt affinity
  net/mlx5: Replace irq_to_desc() abuse
  net/mlx4: Use effective interrupt affinity
  net/mlx4: Replace irq_to_desc() abuse
  PCI: mobiveil: Use irq_data_get_irq_chip_data()
  PCI: xilinx-nwl: Use irq_data_get_irq_chip_data()
  NTB/msi: Use irq_has_action()
  mfd: ab8500-debugfs: Remove the racy fiddling with irq_desc
  pinctrl: nomadik: Use irq_has_action()
  drm/i915/pmu: Replace open coded kstat_irqs() copy
  drm/i915/lpe_audio: Remove pointless irq_to_desc() usage
  s390/irq: Use irq_desc_kstat_cpu() in show_msi_interrupt()
  parisc/irq: Use irq_desc_kstat_cpu() in show_interrupts()
  ...
2020-12-24 13:50:23 -08:00
Chris Wilson
8c3b1ba0e7 drm/i915/gt: Track the overall awake/busy time
Since we wake the GT up before executing a request, and go to sleep as
soon as it is retired, the GT wake time not only represents how long the
device is powered up, but also provides a summary, albeit an overestimate,
of the device runtime (i.e. the rc0 time to compare against rc6 time).

v2: s/busy/awake/
v3: software-gt-awake-time and I915_PMU_SOFTWARE_GT_AWAKE_TIME

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Matthew Brost <matthew.brost@intel.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201215154456.13954-1-chris@chris-wilson.co.uk
2020-12-17 22:26:38 +00:00
Thomas Gleixner
9c6508b9d2 drm/i915/pmu: Replace open coded kstat_irqs() copy
Driver code has no business with the internals of the irq descriptor.

Aside of that the count is per interrupt line and therefore takes
interrupts from other devices into account which share the interrupt line
and are not handled by the graphics driver.

Replace it with a pmu private count which only counts interrupts which
originate from the graphics card.

To avoid atomics or heuristics of some sort make the counter field
'unsigned long'. That limits the count to 4e9 on 32bit which is a lot and
postprocessing can easily deal with the occasional wraparound.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>
Link: https://lore.kernel.org/r/20201210194043.957046529@linutronix.de
2020-12-15 16:19:32 +01:00
Tvrtko Ursulin
c41ce8199d drm/i915/pmu: Remove !CONFIG_PM code
Chris spotted that since 16ffe73c186b ("drm/i915/pmu: Use GT parked for
estimating RC6 while asleep") we don't rely on runtime pm internals when
estimating RC6 while asleep. We can remove the ifdef code to simplify and
at the same time wake up the device less when querying RC6 if CONFIG_PM is
not compiled in.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 16ffe73c186b ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Reported-by: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-3-tvrtko.ursulin@linux.intel.com
2020-12-14 12:55:51 +00:00
Tvrtko Ursulin
c51c29fb35 drm/i915/pmu: Use raw clock for rc6 estimation
RC6 is a hardware counter and as such estimating it using the raw clock
during runtime suspend is more appropriate.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
References: 34f439278cef ("perf: Add per event clockid support")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-2-tvrtko.ursulin@linux.intel.com
2020-12-14 12:55:48 +00:00
Tvrtko Ursulin
dbe13ae1d6 drm/i915/pmu: Don't grab wakeref when enabling events
Chris found a CI report which points out calling intel_runtime_pm_get from
inside i915_pmu_enable hook is not allowed since it can be invoked from
hard irq context. This is something we knew but forgot, so lets fix it
once again.

We do this by syncing the internal book keeping with hardware rc6 counter
on driver load.

v2:
 * Always sync on parking and fully sync on init.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: f4e9894b6952 ("drm/i915/pmu: Correct the rc6 offset upon enabling")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201214094349.3563876-1-tvrtko.ursulin@linux.intel.com
2020-12-14 12:55:43 +00:00
Tvrtko Ursulin
348fb0cb0a drm/i915/pmu: Deprecate I915_PMU_LAST and optimize state tracking
Adding any kinds of "last" abi markers is usually a mistake which I
repeated when implementing the PMU because it felt convenient at the time.

This patch marks I915_PMU_LAST as deprecated and stops the internal
implementation using it for sizing the event status bitmask and array.

New way of sizing the fields is a bit less elegant, but it omits reserving
slots for tracking events we are not interested in, and as such saves some
runtime space. Adding sampling events is likely to be a special event and
the new plumbing needed will be easily detected in testing. Existing
asserts against the bitfield and array sizes are keeping the code safe.

First event which gets the new treatment in this new scheme are the
interrupts - which neither needs any tracking in i915 pmu nor needs
waking up the GPU to read it.

v2:
 * Streamline helper names. (Chris)

v3:
 * Comment which events need tracking. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201201131757.206367-1-tvrtko.ursulin@linux.intel.com
2020-12-02 12:18:13 +00:00
Dave Airlie
334a168393 Merge tag 'drm-intel-gt-next-2020-11-12-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next
Cross-subsystem Changes:
- DMA mapped scatterlist fixes in i915 to unblock merging of
  https://lkml.org/lkml/2020/9/27/70 (Tvrtko, Tom)

Driver Changes:

- Fix for user reported issue #2381 (Graphical output stops with "switching to inteldrmfb from simple"):
  Mark ininitial fb obj as WT on eLLC machines to avoid rcu lockup during fbdev init (Ville, Chris)
- Fix for Tigerlake (and earlier) to avoid spurious empty CSB events leading to hang (Chris, Bruce)
- Delay execlist processing for Tigerlake to avoid hang (Chris)
- Fix for Tigerlake RCS engine health check through heartbeat (Chris)
- Fix for Tigerlake reserved MOCS entries (Ayaz, Chris)
- Fix Media power gate sequence on Tigerlake (Rodrigo)
- Enable eLLC caching of display buffers for SKL+ (Ville)
- Support parsing of oversize batches on Gen9 (Matt, Chris)
- Exclude low pages (128KiB) of stolen from use to avoid thrashing during reset (Chris)
- Flush engines before Tigerlake breadcrumbs (Chris)

- Use the local HWSP offset during submission (Chris)
- Flush coherency domains on first set-domain-ioctl (Chris, Zbigniew)
- Use the active reference on the vma while capturing to avoid use-after-free (Chris)
- Fix MOCS PTE setting for gen9+ (Ville)
- Avoid NULL dereference on IPS driver callback while unbinding i915 (Chris)
- Avoid NULL dereference from PT/PD stash allocation error (Matt)
- Hold request reference for canceling an active context (Chris)
- Avoid infinite loop on x86-32 when mapping a lot of objects (Chris)
- Disallow WC mappings when processor doesn't support them (Chris)
- Return correct error in i915_gem_object_copy_blt() error path (Dan)
- Return correct error in intel_context_create_request() error path (Maarten)
- Tune down GuC communication enabled/disabled messages to debug (Jani)
- Fix rebased commit "Remove i915_request.lock requirement for execution callbacks" (Chris)
- Cancel outstanding work after disabling heartbeats on an engine (Chris)
- Signal cancelled requests (Chris)
- Retire cancelled requests on unload (Chris)
- Scrub HW state on driver remove (Chris)
- Undo forced context restores after trivial preemptions (Chris)
- Handle PCI unbind in PMU code (Tvrtko)
- Fix CPU hotplug with multiple GPUs in PMU code (Trtkko)
- Correctly set SFC capability for video engines (Venkata)

- Update GuC code to use firmware v49.0.1 (John, Matthew B., Daniele, Oscar, Michel, Rodrigo, Michal)
- Improve GuC warnings on loading failure (John)
- Avoid ownership race in buffer pool by clearing age (Chris)
- Use MMIO to read CSB in case of failure (Chris, Mika)
- Show engine properties in engine state dump to indicate changes (Chris, Joonas)
- Break up error capture compression loops with cond_resched() (Chris)
- Reduce GPU error capture mutex hold time to avoid khungtaskd (Chris)
- Serialise debugfs i915_gem_objects with ctx->mutex (Chris)
- Always test execution status on closing the context and close if not persistent (Chris)
- Avoid mixing integer types during batch copies (Chris, Jared)
- Skip over MI_NOOP when parsing to avoid overhead (Chris)
- Hold onto an explicit ref to i915_vma_work.pinned (Chris)
- Perform all asynchronous waits prior to marking payload start (Chris)
- Pull phys pread/pwrite implementations to the backend (Matt)

- Improve record of hung engines in error state (Tvrtko)
- Allow backends to override pread implementation (Matt)
- Reinforce LRC poisoning checks to confirm context survives execution (Chris)
- Fix memory region max size calculation (Matt)
- Fix order when adding blocks to memory region (Matt)
- Eliminate unused intel_virtual_engine_get_sibling func (Chris)
- Cleanup kasan warning for on-stack (unsigned long) casting (Chris)
- Onion unwind for scratch page allocation failure (Chris)
- Poison stolen pages before use (Chris)
- Selftest improvements (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20201112163407.GA20320@jlahtine-mobl.ger.corp.intel.com
2020-11-13 15:01:57 +10:00
Tvrtko Ursulin
537f9c84a4 drm/i915/pmu: Fix CPU hotplug with multiple GPUs
Since we keep a driver global mask of online CPUs and base the decision
whether PMU needs to be migrated upon it, we need to make sure the
migration is done for all registered PMUs (so GPUs).

To do this we need to track the current CPU for each PMU and base the
decision on whether to migrate on a comparison between global and local
state.

At the same time, since dynamic CPU hotplug notification slots are a
scarce resource and given how we already register the multi instance type
state, we can and should add multiple instance of the i915 PMU to this
same state and not allocate a new one for every GPU.

v2:
 * Use pr_notice. (Chris)

v3:
 * Handle a nasty interaction where unregistration which triggers a false
   CPU offline event. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Suggested-by: Daniel Vetter <daniel.vetter@intel.com> # dynamic slot optimisation
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020161144.678668-1-tvrtko.ursulin@linux.intel.com
2020-10-22 10:06:34 +01:00
Tvrtko Ursulin
b00bccb3f0 drm/i915/pmu: Handle PCI unbind
Mark the device as closed and keep references to driver data alive to
allow for safe driver unbind with active PMU clients. Perf core does not
otherwise handle this case so we have to do it manually like this.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20201020100822.543332-1-tvrtko.ursulin@linux.intel.com
2020-10-22 10:06:29 +01:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Chris Wilson
df3ab3cb7e drm/i915: Provide the perf pmu.module
Rather than manually implement our own module reference counting for perf
pmu events, finally realise that there is a module parameter to struct
pmu for this very purpose.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200716094643.31410-1-chris@chris-wilson.co.uk
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
(cherry picked from commit 27e897beec1c59861f15d4d3562c39ad1143620f)
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
2020-08-19 15:23:00 +03:00
Chris Wilson
810b7ee300 drm/i915/gt: Always report the sample time for busy-stats
Return the monotonic timestamp (ktime_get()) at the time of sampling the
busy-time. This is used in preference to taking ktime_get() separately
before or after the read seqlock as there can be some large variance in
reported timestamps. For selftests trying to ascertain that we are
reporting accurate to within a few microseconds, even a small delay
leads to the test failing.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200617130916.15261-2-chris@chris-wilson.co.uk
2020-06-18 09:26:54 +01:00
Arnd Bergmann
6ec81b8273 drm/i915/pmu: avoid an maybe-uninitialized warning
Conditional spinlocks make it hard for gcc and for lockdep to
follow the code flow. This one causes a warning with at least
gcc-9 and higher:

In file included from include/linux/irq.h:14,
                 from drivers/gpu/drm/i915/i915_pmu.c:7:
drivers/gpu/drm/i915/i915_pmu.c: In function 'i915_sample':
include/linux/spinlock.h:289:3: error: 'flags' may be used uninitialized in this function [-Werror=maybe-uninitialized]
  289 |   _raw_spin_unlock_irqrestore(lock, flags); \
      |   ^~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/gpu/drm/i915/i915_pmu.c:288:17: note: 'flags' was declared here
  288 |   unsigned long flags;
      |                 ^~~~~

Split out the part between the locks into a separate function
for readability and to let the compiler figure out what the
logic actually is.

Fixes: d79e1bd676f0 ("drm/i915/pmu: Only use exclusive mmio access for gen7")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200527140526.1458215-1-arnd@arndb.de
2020-05-27 17:07:24 +01:00
Pankaj Bharadiya
bf07f6ebff drm/i915/pmu: Prefer drm_WARN_ON over WARN_ON
struct drm_device specific drm_WARN* macros include device information
in the backtrace, so we know what device the warnings originate from.

Prefer drm_WARN_ON over WARN_ON.

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200504181600.18503-8-pankaj.laxminarayan.bharadiya@intel.com
2020-05-19 16:01:34 +03:00
Chris Wilson
3b55cdeb8f drm/i915/pmu: Keep a reference to module while active
While a perf event is open, keep a reference to the module so we don't
remove the driver internals mid-sampling.

Testcase: igt/perf_pmu/module-unload
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200430183324.23984-1-chris@chris-wilson.co.uk
2020-05-01 09:24:34 +01:00
Chris Wilson
426d0073fb drm/i915/gt: Always enable busy-stats for execlists
In the near future, we will utilize the busy-stats on each engine to
approximate the C0 cycles of each, and use that as an input to a manual
RPS mechanism. That entails having busy-stats always enabled and so we
can remove the enable/disable routines and simplify the pmu setup. As a
consequence of always having the stats enabled, we can also show the
current active time via sysfs/engine/xcs/active_time_ns.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200429205446.3259-1-chris@chris-wilson.co.uk
2020-04-30 00:57:34 +01:00
Jani Nikula
1900aba567 drm/i915/pmu: prefer struct drm_device based logging
Prefer struct drm_device based logging over struct device based logging.

No functional changes.

Cc: Wambui Karuga <wambui.karugax@gmail.com>
Reviewed-by: Wambui Karuga <wambui.karugax@gmail.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200402114819.17232-12-jani.nikula@intel.com
2020-04-08 13:49:35 +03:00
Michał Winiarski
46129dc10f drm/i915/pmu: Avoid using globals for PMU events
Attempting to bind / unbind module from devices where we have both
integrated and discreete GPU handled by i915, will cause us to try and
double free the global state, hitting null ptr deref in free_event_attributes.

Let's move it to i915_pmu.

Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-2-michal.winiarski@intel.com
2020-02-21 17:31:15 +00:00
Michał Winiarski
f5a179d468 drm/i915/pmu: Avoid using globals for CPU hotplug state
Attempting to bind / unbind module from devices where we have both
integrated and discreete GPU handled by i915 can lead to leaks and
warnings from cpuhp:
Error: Removing state XXX which has instances left.

Let's move the state to i915_pmu.

Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200219161822.24592-1-michal.winiarski@intel.com
2020-02-21 17:31:15 +00:00
Pankaj Bharadiya
48a1b8d4af drm/i915: Make WARN* drm specific where drm_priv ptr is available
drm specific WARN* calls include device information in the
backtrace, so we know what device the warnings originate from.

Covert all the calls of WARN* with device specific drm_WARN*
variants in functions where drm_i915_private struct pointer is readily
available.

The conversion was done automatically with below coccinelle semantic
patch. checkpatch errors/warnings are fixed manually.

@rule1@
identifier func, T;
@@
func(...) {
...
struct drm_i915_private *T = ...;
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

@rule2@
identifier func, T;
@@
func(struct drm_i915_private *T,...) {
<+...
(
-WARN(
+drm_WARN(&T->drm,
...)
|
-WARN_ON(
+drm_WARN_ON(&T->drm,
...)
|
-WARN_ONCE(
+drm_WARN_ONCE(&T->drm,
...)
|
-WARN_ON_ONCE(
+drm_WARN_ON_ONCE(&T->drm,
...)
)
...+>
}

command: ls drivers/gpu/drm/i915/*.c | xargs spatch --sp-file \
			<script> --linux-spacing --in-place

Signed-off-by: Pankaj Bharadiya <pankaj.laxminarayan.bharadiya@intel.com>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200115034455.17658-10-pankaj.laxminarayan.bharadiya@intel.com
2020-01-22 17:54:33 +02:00
Chris Wilson
f4e9894b69 drm/i915/pmu: Correct the rc6 offset upon enabling
The rc6 residency starts ticking from 0 from BIOS POST, but the kernel
starts measuring the time from its boot. If we start measuruing
I915_PMU_RC6_RESIDENCY while the GT is idle, we start our sampling from
0 and then upon first activity (park/unpark) add in all the rc6
residency since boot. After the first park with the sampler engaged, the
sleep/active counters are aligned.

v2: With a wakeref to be sure

Closes: https://gitlab.freedesktop.org/drm/intel/issues/973
Fixes: df6a42053513 ("drm/i915/pmu: Ensure monotonic rc6")
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200114105648.2172026-1-chris@chris-wilson.co.uk
2020-01-14 12:55:13 +00:00
Tvrtko Ursulin
aebf3b521b drm/i915/pmu: Do not use colons or dashes in PMU names
We use PCI device path in the registered PMU name in order to distinguish
between multiple GPUs. But since tools/perf reserves a special meaning to
dash and colon characters we need to transliterate them to something else.
We choose an underscore.

v2:
 * Use strreplace. (Chris)
 * Dashes are not good either. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reported-by: Dmitry Rogozhkin <dmitry.v.rogozhkin@intel.com>
Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20200110113253.12535-1-tvrtko.ursulin@linux.intel.com
2020-01-13 10:30:17 +00:00
Tvrtko Ursulin
df6a420535 drm/i915/pmu: Ensure monotonic rc6
Avoid rc6 counter going backward in close to 0% RC6 scenarios like:

    15.005477996        114,246,613 ns   i915/rc6-residency/
    16.005876662            667,657 ns   i915/rc6-residency/
    17.006131417              7,286 ns   i915/rc6-residency/
    18.006615031 18,446,744,073,708,914,688 ns   i915/rc6-residency/
    19.007158361 18,446,744,073,709,447,168 ns   i915/rc6-residency/
    20.007806498                  0 ns   i915/rc6-residency/
    21.008227495          1,440,403 ns   i915/rc6-residency/

There are two aspects to this fix.

First is not assuming rc6 value zero means GT is asleep since that can
also mean GPU is fully busy and we do not want to enter the estimation
path in that case.

Second is ensuring monotonicity on the estimation path itself. I suspect
what is happening is with extremely rapid park/unpark cycles we get no
updates on the real rc6 and therefore have to careful not to
unconditionally trust use last known real rc6 when creating a new
estimation.

v2:
 * Simplify logic by not tracking the estimate but last reported value.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Fixes: 16ffe73c186b ("drm/i915/pmu: Use GT parked for estimating RC6 while asleep")
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk> # v1
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191217142057.1000-1-tvrtko.ursulin@linux.intel.com
2019-12-18 15:23:41 +00:00
Chris Wilson
edb1ecad77 drm/i915/pmu: Skip sampling engines if gt is asleep
If the whole GT is asleep, we know that each engine must also be asleep
and so we can quickly return without checking them all.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191218000756.3475668-1-chris@chris-wilson.co.uk
2019-12-18 11:09:15 +00:00
Andi Shyti
e03512edd2 drm/i915/rps: Add frequency translation helpers
Add two helpers that for reading the actual GT's frequency. The
two helpers are:

 - intel_rps_read_cagf: reads the frequency and returns it not
   normalized

 - intel_rps_read_actual_frequency: provides the frequency in Hz.

Use the above helpers in sysfs and debugfs.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191213183736.31992-2-andi@etezian.org
2019-12-13 22:22:05 +00:00
Tvrtko Ursulin
b66ecd0438 drm/i915/pmu: Report frequency as zero while GPU is sleeping
We used to report the minimum possible frequency as both requested and
active while GPU was in sleep state. This was a consequence of sampling
the value from the "current frequency" field in our software tracking.

This was strictly speaking wrong, but given that until recently the
current frequency in sleeping state used to be equal to minimum, it did
not stand out sufficiently to be noticed as such.

After some recent changes have made the current frequency be reported
as last active before GPU went to sleep, meaning both requested and active
frequencies could end up being reported at their maximum values for the
duration of the GPU idle state, it became much more obvious that this does
not make sense.

To fix this we will now sample the frequency counters only when the GPU is
awake. As a consequence reported frequencies could be reported as below
the GPU reported minimum but that should be much less confusing that the
current situation.

v2:
 * Split out early exit conditions for readability. (Chris)

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Closes: https://gitlab.freedesktop.org/drm/intel/issues/675
Link: https://patchwork.freedesktop.org/patch/msgid/20191129105436.20100-1-tvrtko.ursulin@linux.intel.com
2019-12-06 13:03:36 +00:00
Chris Wilson
07779a76ee drm/i915: Mark up the calling context for intel_wakeref_put()
Previously, we assumed we could use mutex_trylock() within an atomic
context, falling back to a worker if contended. However, such trickery
is illegal inside interrupt context, and so we need to always use a
worker under such circumstances. As we normally are in process context,
we can typically use a plain mutex, and only defer to a work when we
know we are being called from an interrupt path.

Fixes: 51fbd8de87dc ("drm/i915/pmu: Atomically acquire the gt_pm wakeref")
References: a0855d24fc22d ("locking/mutex: Complain upon mutex API misuse in IRQ contexts")
References: https://bugs.freedesktop.org/show_bug.cgi?id=111626
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191120125433.3767149-1-chris@chris-wilson.co.uk
2019-11-20 15:59:23 +00:00
Chris Wilson
e88866ef02 drm/i915/pmu: "Frequency" is reported as accumulated cycles
We report "frequencies" (actual-frequency, requested-frequency) as the
number of accumulated cycles so that the average frequency over that
period may be determined by the user. This means the units we report to
the user are Mcycles (or just M), not MHz.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: stable@vger.kernel.org
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191109105356.5273-1-chris@chris-wilson.co.uk
2019-11-11 11:59:12 +00:00
Chris Wilson
d79e1bd676 drm/i915/pmu: Only use exclusive mmio access for gen7
On gen7, we have to avoid concurrent access to the same mmio cacheline,
and so coordinate all mmio access with the uncore->lock. However, for
pmu, we want to avoid perturbing the system and disabling interrupts
unnecessarily, so restrict the w/a to gen7 where it is requied to
prevent machine hangs.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-2-chris@chris-wilson.co.uk
2019-11-08 11:37:03 +00:00
Chris Wilson
c1c82d267a drm/i915/pmu: Cheat when reading the actual frequency to avoid fw
We want to avoid taking forcewake when querying the performance stats,
as we wish to avoid perturbing the system under observation. (And with
the forcewake being kept alive for 1ms after use, sampling the frequency
from a 200Hz timer keeps forcewake 40% active.)

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191108103511.20951-1-chris@chris-wilson.co.uk
2019-11-08 11:36:10 +00:00
Andi Shyti
3e7abf8141 drm/i915: Extract GT render power state management
i915_irq.c is large. One reason for this is that has a large chunk of
the GT render power management stashed away in it. Extract that logic
out of i915_irq.c and intel_pm.c and put it under one roof.

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191024211642.7688-1-chris@chris-wilson.co.uk
2019-10-26 19:28:59 +01:00
Chris Wilson
c442292a66 drm/i915/pmu: Initialise the spinlock before registering
As the GT may be running in parallel with the module initialisation
code, we may enter i915_pmu_gt_parked() as we are executing
i915_pmu_register(). We have to init the spinlock before we mark
pmu.event_init so that it is available for use by i915_pmu_gt_parked()
(which may run as soon as event_init is set).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=112127
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191025165442.23356-1-chris@chris-wilson.co.uk
2019-10-25 22:56:14 +01:00
Chris Wilson
c6e07ada8e drm/i915/gt: Convert the leftover for_each_engine(gt)
Use the local gt for iterating over the available set of engines.

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018115331.8980-1-chris@chris-wilson.co.uk
2019-10-18 14:53:48 +01:00
Tvrtko Ursulin
fb26eee060 drm/i915/pmu: Fix uninitialized variable on error path
If name allocation failed the log message will contain an uninitialized
error code which can be confusing.

Fixes: 05488673a4d4 ("drm/i915/pmu: Support multiple GPUs")
Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191018090514.1818-1-tvrtko.ursulin@linux.intel.com
[tursulin: Commit message spelling fix.]
2019-10-18 11:16:41 +01:00
Tvrtko Ursulin
05488673a4 drm/i915/pmu: Support multiple GPUs
With discrete graphics system can have both integrated and discrete GPU
handled by i915.

Currently we use a fixed name ("i915") when registering as the uncore PMU
provider which stops working in this case.

To fix this we add the PCI device name string to non-integrated devices
handled by us. Integrated devices keep the legacy name preserving
backward compatibility.

v2:
 * Detect IGP and keep legacy name. (Michal)
 * Use PCI device name as suffix. (Michal, Chris)

v3:
 * Constify the name. (Chris)
 * Use pci_domain_nr. (Chris)

v4:
 * Fix kfree_const usage. (Chris)

v5:
 * kfree_const does not work for modules. (Chris)
 * Changed is_igp helper to take i915.

Signed-off-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Michal Wajdeczko <michal.wajdeczko@intel.com>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20191016093802.12483-1-tvrtko.ursulin@linux.intel.com
2019-10-17 10:50:47 +01:00
Andi Shyti
c113236718 drm/i915: Extract GT render sleep (rc6) management
Continuing the theme of breaking intel_pm.c up in a reasonable chunk of
powermanagement utilities, pull out the rc6 setup into its GT handler.

Based on a patch by Chris Wilson.

Signed-off-by: Andi Shyti <andi.shyti@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/20190919143840.20384-1-andi.shyti@intel.com
Link: https://patchwork.freedesktop.org/patch/msgid/20190927110849.28734-1-chris@chris-wilson.co.uk
2019-09-27 13:01:57 +01:00