14 Commits

Author SHA1 Message Date
Lyude Paul
7cb95eeea6 drm/nouveau/i2c: Enable i2c pads & busses during preinit
It turns out that while disabling i2c bus access from software when the
GPU is suspended was a step in the right direction with:

commit 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after
->fini()")

We also ended up accidentally breaking the vbios init scripts on some
older Tesla GPUs, as apparently said scripts can actually use the i2c
bus. Since these scripts are executed before initializing any
subdevices, we end up failing to acquire access to the i2c bus which has
left a number of cards with their fan controllers uninitialized. Luckily
this doesn't break hardware - it just means the fan gets stuck at 100%.

This also means that we've always been using our i2c busses before
initializing them during the init scripts for older GPUs, we just didn't
notice it until we started preventing them from being used until init.
It's pretty impressive this never caused us any issues before!

So, fix this by initializing our i2c pad and busses during subdev
pre-init. We skip initializing aux busses during pre-init, as those are
guaranteed to only ever be used by nouveau for DP aux transactions.

Signed-off-by: Lyude Paul <lyude@redhat.com>
Tested-by: Marc Meledandri <m.meledandri@gmail.com>
Fixes: 342406e4fbba ("drm/nouveau/i2c: Disable i2c bus access after ->fini()")
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-07-19 16:26:50 +10:00
Lyude Paul
342406e4fb drm/nouveau/i2c: Disable i2c bus access after ->fini()
For a while, we've had the problem of i2c bus access not grabbing
a runtime PM ref when it's being used in userspace by i2c-dev, resulting
in nouveau spamming the kernel log with errors if anything attempts to
access the i2c bus while the GPU is in runtime suspend. An example:

[  130.078386] nouveau 0000:01:00.0: i2c: aux 000d: begin idle timeout ffffffff

Since the GPU is in runtime suspend, the MMIO region that the i2c bus is
on isn't accessible. On x86, the standard behavior for accessing an
unavailable MMIO region is to just return ~0.

Except, that turned out to be a lie. While computers with a clean
concious will return ~0 in this scenario, some machines will actually
completely hang a CPU on certian bad MMIO accesses. This was witnessed
with someone's Lenovo ThinkPad P50, where sensors-detect attempting to
access the i2c bus while the GPU was suspended would result in a CPU
hang:

  CPU: 5 PID: 12438 Comm: sensors-detect Not tainted 5.0.0-0.rc4.git3.1.fc30.x86_64 #1
  Hardware name: LENOVO 20EQS64N17/20EQS64N17, BIOS N1EET74W (1.47 ) 11/21/2017
  RIP: 0010:ioread32+0x2b/0x30
  Code: 81 ff ff ff 03 00 77 20 48 81 ff 00 00 01 00 76 05 0f b7 d7 ed c3
  48 c7 c6 e1 0c 36 96 e8 2d ff ff ff b8 ff ff ff ff c3 8b 07 <c3> 0f 1f
  40 00 49 89 f0 48 81 fe ff ff 03 00 76 04 40 88 3e c3 48
  RSP: 0018:ffffaac3c5007b48 EFLAGS: 00000292 ORIG_RAX: ffffffffffffff13
  RAX: 0000000001111000 RBX: 0000000001111000 RCX: 0000043017a97186
  RDX: 0000000000000aaa RSI: 0000000000000005 RDI: ffffaac3c400e4e4
  RBP: ffff9e6443902c00 R08: ffffaac3c400e4e4 R09: ffffaac3c5007be7
  R10: 0000000000000004 R11: 0000000000000001 R12: ffff9e6445dd0000
  R13: 000000000000e4e4 R14: 00000000000003c4 R15: 0000000000000000
  FS:  00007f253155a740(0000) GS:ffff9e644f600000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00005630d1500358 CR3: 0000000417c44006 CR4: 00000000003606e0
  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
  Call Trace:
   g94_i2c_aux_xfer+0x326/0x850 [nouveau]
   nvkm_i2c_aux_i2c_xfer+0x9e/0x140 [nouveau]
   __i2c_transfer+0x14b/0x620
   i2c_smbus_xfer_emulated+0x159/0x680
   ? _raw_spin_unlock_irqrestore+0x1/0x60
   ? rt_mutex_slowlock.constprop.0+0x13d/0x1e0
   ? __lock_is_held+0x59/0xa0
   __i2c_smbus_xfer+0x138/0x5a0
   i2c_smbus_xfer+0x4f/0x80
   i2cdev_ioctl_smbus+0x162/0x2d0 [i2c_dev]
   i2cdev_ioctl+0x1db/0x2c0 [i2c_dev]
   do_vfs_ioctl+0x408/0x750
   ksys_ioctl+0x5e/0x90
   __x64_sys_ioctl+0x16/0x20
   do_syscall_64+0x60/0x1e0
   entry_SYSCALL_64_after_hwframe+0x49/0xbe
  RIP: 0033:0x7f25317f546b
  Code: 0f 1e fa 48 8b 05 1d da 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff
  ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01
  f0 ff ff 73 01 c3 48 8b 0d ed d9 0c 00 f7 d8 64 89 01 48
  RSP: 002b:00007ffc88caab68 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
  RAX: ffffffffffffffda RBX: 00005630d0fe7260 RCX: 00007f25317f546b
  RDX: 00005630d1598e80 RSI: 0000000000000720 RDI: 0000000000000003
  RBP: 00005630d155b968 R08: 0000000000000001 R09: 00005630d15a1da0
  R10: 0000000000000070 R11: 0000000000000246 R12: 00005630d1598e80
  R13: 00005630d12f3d28 R14: 0000000000000720 R15: 00005630d12f3ce0
  watchdog: BUG: soft lockup - CPU#5 stuck for 23s! [sensors-detect:12438]

Yikes! While I wanted to try to make it so that accessing an i2c bus on
nouveau would wake up the GPU as needed, airlied pointed out that pretty
much any usecase for userspace accessing an i2c bus on a GPU (mainly for
the DDC brightness control that some displays have) is going to only be
useful while there's at least one display enabled on the GPU anyway, and
the GPU never sleeps while there's displays running.

Since teaching the i2c bus to wake up the GPU on userspace accesses is a
good deal more difficult than it might seem, mostly due to the fact that
we have to use the i2c bus during runtime resume of the GPU, we instead
opt for the easiest solution: don't let userspace access i2c busses on
the GPU at all while it's in runtime suspend.

Changes since v1:
* Also disable i2c busses that run over DP AUX

Signed-off-by: Lyude Paul <lyude@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2019-05-01 11:08:39 +10:00
Ben Skeggs
56d06fa29e drm/nouveau/core: remove pmc_enable argument from subdev ctor
These are now specified directly in the MC subdev.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2016-05-20 14:43:04 +10:00
Ben Skeggs
49bd8da513 drm/nouveau/i2c: convert to new-style nvkm_subdev
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:43 +10:00
Ben Skeggs
3a8c3400f3 drm/nouveau/subdev: rename some functions to avoid upcoming conflicts
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:33 +10:00
Ben Skeggs
2aa5eac516 drm/nouveau/i2c: transition pad/ports away from being based on nvkm_object
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:29 +10:00
Ben Skeggs
7f5f518fd7 drm/nouveau/bios: remove object accessor functions
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:26 +10:00
Ben Skeggs
1cb57d25b6 drm/nouveau/i2c: switch to subdev printk macros
Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:22 +10:00
Ben Skeggs
5b920d9264 drm/nouveau/i2c: cosmetic changes
This is purely preparation for upcoming commits, there should be no
code changes here.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:08 +10:00
Ben Skeggs
9ace404b10 drm/nouveau/device: include core/device.h automatically for subdevs/engines
Pretty much every subdev/engine is going to need access to nvkm_device
shortly to touch registers and/or output messages.

The odd placement of the includes is necessary to work around some
inter-dependencies that currently exist.  This will be fixed later.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-08-28 12:40:07 +10:00
Ben Skeggs
78b2b4e76b drm/nouveau/instmem: namespace + nvidia gpu names (no binary change)
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver.  This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).

Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.

A comparison of objdump disassemblies proves no code changes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-01-22 12:17:55 +10:00
Ben Skeggs
b9ec14246d drm/nouveau/i2c: namespace + nvidia gpu names (no binary change)
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver.  This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).

Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.

A comparison of objdump disassemblies proves no code changes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-01-22 12:17:54 +10:00
Ben Skeggs
5025407b98 drm/nouveau/core: namespace + nvidia gpu names (no binary change)
The namespace of NVKM is being changed to nvkm_ instead of nouveau_,
which will be used for the DRM part of the driver.  This is being
done in order to make it very clear as to what part of the driver a
given symbol belongs to, and as a minor step towards splitting the
DRM driver out to be able to stand on its own (for virt).

Because there's already a large amount of churn here anyway, this is
as good a time as any to also switch to NVIDIA's device and chipset
naming to ease collaboration with them.

A comparison of objdump disassemblies proves no code changes.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-01-22 12:17:49 +10:00
Ben Skeggs
c39f472e9f drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)
The symlinks were annoying some people, and they're not used anywhere
else in the kernel tree.  The include directory structure has been
changed so that symlinks aren't needed anymore.

NVKM has been moved from core/ to nvkm/ to make it more obvious as to
what the directory is for, and as some minor prep for when NVKM gets
split out into its own module (virt) at a later date.

Signed-off-by: Ben Skeggs <bskeggs@redhat.com>
2015-01-22 12:15:10 +10:00