linux/drivers
Lukas Wunner 8a61449941 PCI: pciehp: Reduce noisiness on hot removal

When a PCIe card is hot-removed, the Presence Detect State and Data Link
Layer Link Active bits often do not clear simultaneously.  I've seen delays
of up to 244 msec between the two events with Thunderbolt.

After pciehp has brought down the slot in response to the first event, the
other bit may still be set.  It's not discernible whether it's set because
a new card is already in the slot or if it will soon clear.  So pciehp
tries to bring up the slot and in the latter case fails with a bunch of
messages, some of them at KERN_ERR severity.  If the slot is no longer
occupied, the messages are false positives and annoy users.

Stuart Hayes reports the following splat on hot removal:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Timeout waiting for Presence Detect
  KERN_ERR  pcieport 0000:3c:06.0: pciehp: link training error: status 0x0001
  KERN_ERR  pcieport 0000:3c:06.0: pciehp: Failed to check link status

Dongdong Liu complains about a similar splat:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Link Down
  KERN_INFO iommu: Removing device 0000:87:00.0 from group 12
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pcieport 0000:80:10.0: Data Link Layer Link Active not set in 1000 msec
  KERN_ERR  pciehp 0000:80:10.0:pcie004: Failed to check link status

Users are particularly irritated to see a bringup attempt even though the
slot was explicitly brought down via sysfs.  In a perfect world, we could
avoid this by setting Link Disable on slot bringdown and re-enabling it
upon a Presence Detect State change.  In reality, however, there are broken
hotplug ports which hardwire Presence Detect to zero, see 80696f9914
("PCI: pciehp: Tolerate Presence Detect hardwired to zero").  Conversely,
PCIe r1.0 hotplug ports hardwire Link Active to zero because Link Active
Reporting wasn't specified before PCIe r1.1.  On unplug, some ports first
clear Presence then Link (see Stuart Hayes' splat) whereas others use the
inverse order (see Dongdong Liu's splat).  To top it off, there are hotplug
ports which flap the Presence and Link bits on slot bringup, see
6c35a1ac3d ("PCI: pciehp: Tolerate initially unstable link").

pciehp is designed to work with all of these variants.  Surplus attempts at
slot bringup are a lesser evil than not being able to bring up slots at
all.  Although we could try to perfect the behavior for specific hotplug
controllers, we'd risk breaking others or increasing code complexity.

But we can certainly minimize annoyance by emitting only a single message
with KERN_INFO severity if bringup is unsuccessful:

* Drop the "Timeout waiting for Presence Detect" message in
  pcie_wait_for_presence().  The sole caller of that function,
  pciehp_check_link_status(), ignores the timeout and carries on.  It emits
  error messages of its own and I don't think this particular message adds
  much value.

* There's a single error condition in pciehp_check_link_status() which
  does not emit a message.  Adding one allows dropping the "Failed to check
  link status" message emitted by board_added() if
  pciehp_check_link_status() returns a non-zero integer.

* Tone down all messages in pciehp_check_link_status() to KERN_INFO
  severity and rephrase them to look as innocuous as possible.  To this
  end, move the message emitted by pcie_wait_for_link_delay() to its
  callers.
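
The three changes above amount to "every failure path reports exactly once,
at informational severity".  A minimal userspace sketch of that pattern
follows.  It is not the kernel code: struct slot_state, the buffer-based
reporting and the function signature are hypothetical stand-ins, and the
wording of the "No device found" message is an assumption.  Only the
"No link" and "Cannot train link" strings and the Link Status bit
definitions come from the text above and the PCIe register layout.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* PCIe Link Status register fields (same values as in pci_regs.h). */
#define LNKSTA_NLW 0x03f0 /* Negotiated Link Width */
#define LNKSTA_LT  0x0800 /* Link Training in progress */

/* Hypothetical snapshot of what a bringup attempt observes. */
struct slot_state {
	bool link_active;     /* did Data Link Layer Link Active get set? */
	uint16_t lnk_status;  /* raw Link Status register value */
	bool device_found;    /* did a device respond below the port? */
};

/*
 * Returns 0 on success, -1 on failure.  On failure, exactly one
 * informational message is written to buf; on success buf is empty.
 * The caller does not add a second message of its own.
 */
static int check_link_status(const struct slot_state *s, const char *slot,
			     char *buf, size_t len)
{
	buf[0] = '\0';

	if (!s->link_active) {
		snprintf(buf, len, "Slot(%s): No link", slot);
		return -1;
	}

	/* Still training, or no lanes negotiated: the link is unusable. */
	if ((s->lnk_status & LNKSTA_LT) || !(s->lnk_status & LNKSTA_NLW)) {
		snprintf(buf, len, "Slot(%s): Cannot train link: status %#06x",
			 slot, (unsigned int)s->lnk_status);
		return -1;
	}

	/* Message wording here is assumed, not taken from the text above. */
	if (!s->device_found) {
		snprintf(buf, len, "Slot(%s): No device found", slot);
		return -1;
	}

	return 0;
}
```

With a status of 0x0001 the Negotiated Link Width field is zero, so the
sketch takes the "Cannot train link" path, matching the first splat.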

As a result, Stuart Hayes' splat becomes:

  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Link Up
  KERN_INFO pcieport 0000:3c:06.0: pciehp: Slot(180): Cannot train link: status 0x0001

Dongdong Liu's splat becomes:

  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): Card present
  KERN_INFO pciehp 0000:80:10.0:pcie004: Slot(36): No link

The messages now merely serve as information that presence or link bits
were set a little longer than expected.  Bringup failures which are not
false positives are still reported, albeit no longer at KERN_ERR severity.

Link: https://lore.kernel.org/linux-pci/20200310182100.102987-1-stuart.w.hayes@gmail.com/
Link: https://lore.kernel.org/linux-pci/1547649064-19019-1-git-send-email-liudongdong3@huawei.com/
Link: https://lore.kernel.org/r/b45e46fd8a6aa6930aaac9d7718c2e4b787a4e5e.1595935071.git.lukas@wunner.de
Reported-by: Stuart Hayes <stuart.w.hayes@gmail.com>
Reported-by: Dongdong Liu <liudongdong3@huawei.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
2020-09-17 16:22:36 -05:00
accessibility TTY/Serial patches for 5.9-rc1 2020-08-06 14:56:11 -07:00
acpi More ACPI updates for 5.9-rc1 2020-08-15 08:18:22 -07:00
amba
android
ata
atm Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
auxdisplay Minor cleanup for auxdisplay: 2020-08-06 18:09:34 -07:00
base More power management updates for 5.9-rc1 2020-08-07 13:13:09 -07:00
bcma bcma: gpio: Use irqchip template 2020-08-02 18:26:51 +03:00
block block-5.9-2020-08-14 2020-08-15 20:36:42 -07:00
bluetooth Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next 2020-07-31 15:11:52 -07:00
bus MIPS upates for v5.9 2020-08-06 10:54:07 -07:00
cdrom
char Linux 5.8 2020-08-11 11:58:31 +10:00
clk More ACPI updates for 5.9-rc1 2020-08-15 08:18:22 -07:00
clocksource - Core Frameworks 2020-08-15 08:09:38 -07:00
connector
counter
cpufreq cpufreq: intel_pstate: Implement passive mode with HWP enabled 2020-08-11 17:29:45 +02:00
cpuidle powerpc updates for 5.9 2020-08-07 10:33:50 -07:00
crypto virtio: fixes, features 2020-08-11 14:34:17 -07:00
dax libnvdimm for 5.9 2020-08-11 10:59:19 -07:00
dca
devfreq PM / devfreq: Fix the wrong end with semicolon 2020-07-30 17:22:58 +09:00
dio
dma Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 21:14:30 -07:00
dma-buf A set of locking fixes and updates: 2020-08-10 19:07:44 -07:00
edac Fixes for ie31200 driver that missed the first pull 2020-08-15 08:25:41 -07:00
eisa
extcon
firewire
firmware uaccess: add force_uaccess_{begin,end} helpers 2020-08-12 10:57:59 -07:00
fpga
fsi
gnss
gpio This is the bulk of GPIO changes for the v5.9 kernel cycle: 2020-08-05 12:56:27 -07:00
gpu pwm: Changes for v5.9-rc1 2020-08-14 16:00:09 -07:00
greybus
hid Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid 2020-08-10 16:33:54 -07:00
hsi
hv hyperv-fixes for 5.9-rc 2020-08-14 13:31:25 -07:00
hwmon pwm: Changes for v5.9-rc1 2020-08-14 16:00:09 -07:00
hwspinlock
hwtracing
i2c More ACPI updates for 5.9-rc1 2020-08-15 08:18:22 -07:00
i3c
ide
idle Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
iio
infiniband mm/gup: remove task_struct pointer for all gup code 2020-08-12 10:58:04 -07:00
input Cleanup, SECCOMP_FILTER support, message printing fixes, and other 2020-08-15 18:50:32 -07:00
interconnect Char/Misc driver patches for 5.9-rc1 2020-08-05 11:43:47 -07:00
iommu Merge branch 'akpm' (patches from Andrew) 2020-08-12 11:24:12 -07:00
ipack
irqchip The usual boring updates from the interrupt subsystem: 2020-08-04 18:11:58 -07:00
isdn
leds LEDs changes for 5.9-rc1. 2020-08-05 19:24:27 -07:00
lightnvm
macintosh powerpc updates for 5.9 2020-08-07 10:33:50 -07:00
mailbox iomap: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
mcb
md block-5.9-2020-08-14 2020-08-15 20:36:42 -07:00
media IOMMU Updates for Linux v5.9 2020-08-11 14:13:24 -07:00
memory IOMMU Updates for Linux v5.9 2020-08-11 14:13:24 -07:00
memstick MMC core: 2020-08-05 13:23:24 -07:00
message
mfd - Core Frameworks 2020-08-15 08:09:38 -07:00
misc Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 21:14:30 -07:00
mmc This tree adds the sched_set_fifo*() encapsulation APIs to remove 2020-08-06 11:55:43 -07:00
most drivers: most: add USB adapter driver 2020-07-31 14:38:12 +02:00
mtd This pull request contains changes for JFFS2, UBI and UBIFS 2020-08-10 18:20:04 -07:00
mux
net rtl818x: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
nfc
ntb ntb: intel: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
nubus
nvdimm mm: add thp_size 2020-08-14 19:56:56 -07:00
nvme for-5.9/block-merge-20200804 2020-08-05 11:12:34 -07:00
nvmem
of MIPS upates for v5.9 2020-08-06 10:54:07 -07:00
opp Merge branch 'cpufreq/arm/linux-next' of git://git.kernel.org/pub/scm/linux/kernel/git/vireshk/pm 2020-08-04 12:44:53 +02:00
oprofile
parisc Merge branch 'parisc-5.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux 2020-08-12 12:41:15 -07:00
parport
pci PCI: pciehp: Reduce noisiness on hot removal 2020-09-17 16:22:36 -05:00
pcmcia
perf It looks like a smaller batch of clk updates this time around. In the core 2020-08-07 13:35:51 -07:00
phy
pinctrl This is the bulk of the pin control changes for the v5.9 2020-08-09 12:52:28 -07:00
platform linux-watchdog 5.9-rc1 tag 2020-08-12 12:13:44 -07:00
pnp
power power supply and reset changes for the v5.9 series 2020-08-07 21:27:37 -07:00
powercap This tree adds the sched_set_fifo*() encapsulation APIs to remove 2020-08-06 11:55:43 -07:00
pps
ps3
ptp ptp: only allow phase values lower than 1 period 2020-08-05 12:06:44 -07:00
pwm pwm: Changes for v5.9-rc1 2020-08-14 16:00:09 -07:00
rapidio rapidio/rio_mport_cdev: use array_size() helper in copy_{from,to}_user() 2020-08-12 10:58:01 -07:00
ras
regulator Merge remote-tracking branch 'regulator/for-5.9' into regulator-next 2020-07-30 23:27:08 +01:00
remoteproc remoteproc updates for v5.9 2020-08-11 11:17:45 -07:00
reset
rpmsg
rtc RTC for 5.9 2020-08-12 17:17:00 -07:00
s390 s390/pkey: remove redundant variable initialization 2020-08-11 18:16:31 +02:00
sbus
scsi SCSI misc on 20200814 2020-08-14 16:01:59 -07:00
sfi
sh iomap: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
siox
slimbus
soc Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 21:14:30 -07:00
soundwire
spi sound updates for 5.9 2020-08-06 14:27:31 -07:00
spmi
ssb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-08-05 20:13:21 -07:00
staging pci-v5.9-changes 2020-08-07 18:48:15 -07:00
target SCSI misc on 20200814 2020-08-14 16:01:59 -07:00
tc
tee
thermal - Core Frameworks 2020-08-15 08:09:38 -07:00
thunderbolt thunderbolt: merge fix for kunix_resource changes 2020-08-09 11:06:10 -07:00
tty TTY/Serial patches for 5.9-rc1 2020-08-06 14:56:11 -07:00
uio
usb media updates for v5.9-rc1 2020-08-07 13:00:53 -07:00
vdpa virtio: fixes, features 2020-08-11 14:34:17 -07:00
vfio VFIO updates for v5.9-rc1 2020-08-12 12:09:36 -07:00
vhost virtio: fixes, features 2020-08-11 14:34:17 -07:00
video pwm: Changes for v5.9-rc1 2020-08-14 16:00:09 -07:00
virt
virtio virtio: pci: constify ioreadX() iomem argument (as in generic implementation) 2020-08-14 19:56:57 -07:00
visorbus
vlynq
vme
w1
watchdog linux-watchdog 5.9-rc1 tag 2020-08-12 12:13:44 -07:00
xen xen: branch for v5.9-rc1b 2020-08-14 13:34:37 -07:00
zorro
Kconfig
Makefile