6304 Commits

Author SHA1 Message Date
Bjorn Helgaas
842b447f00 PCI/portdrv: Encapsulate pcie_ports_auto inside the port driver
"pcie_ports_auto" is only used inside the PCIe port driver itself, so
move it from include/linux/pci.h to portdrv.h so it's not visible to the
whole kernel.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-30 17:26:58 -05:00
Bjorn Helgaas
4c0fd7648d PCI/portdrv: Remove unnecessary "pcie_ports=auto" parameter
The "pcie_ports=auto" parameter set pcie_ports_disabled and pcie_ports_auto
to their compiled-in defaults, so specifying the parameter is the same as
not using it at all.

Remove the "pcie_ports=auto" parameter and update the documentation.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-30 17:26:57 -05:00
Bjorn Helgaas
1e447c57ae PCI/portdrv: Remove "pcie_hp=nomsi" kernel parameter
7570a333d8b0 ("PCI: Add pcie_hp=nomsi to disable MSI/MSI-X for pciehp
driver") added the "pcie_hp=nomsi" kernel parameter to work around this
error on shutdown:

  irq 16: nobody cared (try booting with the "irqpoll" option)
  Pid: 1081, comm: reboot Not tainted 3.2.0 #1
  ...
  Disabling IRQ #16

This happened on an unspecified system (possibly involving the Integrated
Device Technology, Inc. Device 807f bridge) where "an un-wanted interrupt
is generated when PCI driver switches from MSI/MSI-X to INTx while shutting
down the device."

The implication was that the device was buggy, but it is normal for a
device to use INTx after MSI/MSI-X have been disabled.  The only problem
was that the driver was still attached and it wasn't prepared for INTx
interrupts.  Prarit Bhargava fixed this issue with fda78d7a0ead ("PCI/MSI:
Stop disabling MSI/MSI-X in pci_device_shutdown()").

There is no automated way to set this parameter, so it's not very useful
for distributions or end users.  It's really only useful for debugging, and
we have "pci=nomsi" for that purpose.

Revert 7570a333d8b0 to remove the "pcie_hp=nomsi" parameter.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
CC: MUNEDA Takahiro <muneda.takahiro@jp.fujitsu.com>
CC: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
CC: Prarit Bhargava <prarit@redhat.com>
2018-03-30 17:26:56 -05:00
Bjorn Helgaas
1b64cb87cf PCI/portdrv: Remove unnecessary include of <linux/pci-aspm.h>
portdrv_pci.c doesn't use anything from <linux/pci-aspm.h>.  Remove the
include of it.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-30 17:26:55 -05:00
Bjorn Helgaas
02bfeb4842 PCI/portdrv: Simplify PCIe feature permission checking
Some PCIe features (AER, DPC, hotplug, PME) can be managed by either the
platform firmware or the OS, so the host bridge driver may have to request
permission from the platform before using them.  On ACPI systems, this is
done by negotiate_os_control() in acpi_pci_root_add().

The PCIe port driver later uses pcie_port_platform_notify() and
pcie_port_acpi_setup() to figure out whether it can use these features.
But all we need is a single bit for each service, so these interfaces are
needlessly complicated.

Simplify this by adding bits in the struct pci_host_bridge to show when the
OS has permission to use each feature:

  + unsigned int native_aer:1;       /* OS may use PCIe AER */
  + unsigned int native_hotplug:1;   /* OS may use PCIe hotplug */
  + unsigned int native_pme:1;       /* OS may use PCIe PME */

These are set when we create a host bridge, and the host bridge driver can
clear the bits corresponding to any feature the platform doesn't want us to
use.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-30 17:26:54 -05:00
Bjorn Helgaas
168f3ae595 PCI/portdrv: Remove unused PCIE_PORT_SERVICE_VC
No driver registers for PCIE_PORT_SERVICE_VC, so remove it.

This removes the VC "service" files from /sys/bus/pci_express/devices,
e.g., 0000:07:00.0:pcie108, 0000:08:04.0:pcie208 (all the files that
contained "8" as the last digit of the "pcieXXX" part).  The port driver
created these files for PCIe port devices that have a VC Capability.

Since this reduces PCIE_PORT_DEVICE_MAXSERVICES and moves DPC down into the
spot where VC used to be, the DPC sysfs files will now be named "pcieXX8".
I don't think there's anything useful userspace can do with those files, so
I hope nobody cares about these filenames.

There is no VC driver that calls pcie_port_service_register(), so there
never was a /sys/bus/pci_express/drivers/vc directory.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-03-30 17:26:54 -05:00
Bjorn Helgaas
c6c889d932 PCI/portdrv: Remove pcie_port_bus_type link order dependency
The pcie_port_bus_type must be registered before drivers that depend on it
can be registered.  Those drivers include:

  pcied_init()                # PCIe native hotplug driver
  aer_service_init()          # AER driver
  dpc_service_init()          # DPC driver
  pcie_pme_service_init()     # PME driver

Previously we registered pcie_port_bus_type from pcie_portdrv_init(), a
device_initcall.  The callers of pcie_port_service_register() (above) are
also device_initcalls.  This is fragile because the device_initcall
ordering depends on link order, which is not explicit.

Register pcie_port_bus_type from pci_driver_init() along with pci_bus_type.
This removes the link order dependency between portdrv and the pciehp, AER,
DPC, and PCIe PME drivers.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-03-30 17:26:53 -05:00
Bjorn Helgaas
79a011194b PCI/portdrv: Disable port driver in compat mode
The "pcie_ports=compat" kernel parameter sets pcie_ports_disabled, which is
intended to disable the PCIe port driver.  But even when it was disabled,
we registered pcie_portdriver so we could work around a BIOS PME issue (see
fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume")).

Registering the driver meant that the pcie_portdrv_probe() path called
pci_enable_device(), pci_save_state(), pm_runtime_set_autosuspend_delay(),
pm_runtime_use_autosuspend(), etc., even when the driver was disabled.

We've since moved the BIOS PME workaround from the port driver to the core,
so stop registering the PCIe port driver in compat mode.

This means "pcie_ports=compat" will now be basically the same as turning
off CONFIG_PCIEPORTBUS completely.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-30 17:26:52 -05:00
Bjorn Helgaas
3620c71484 PCI/PM: Clear PCIe PME Status bit for Root Complex Event Collectors
Per PCIe r4.0, sec 6.1.6, Root Complex Event Collectors can generate PME
interrupts on behalf of Root Complex Integrated Endpoints.

Linux does not currently enable PME interrupts from RC Event Collectors,
but fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") suggests PME interrupts may be enabled by the platform for ACPI-
based runtime wakeup.

Clear the PCIe PME Status bit for Root Complex Event Collectors during
resume, just like we already do for Root Ports.

If the BIOS enables PME interrupts for an event collector and neglects to
clear the status bit on resume, this change should fix the same bug as
fe31e69740ed (PMEs not working after waking from a sleep state), but for
Root Complex Integrated Endpoints.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2018-03-30 17:26:51 -05:00
Tal Gilboa
b852f63aa6 PCI: Add pcie_bandwidth_capable() to compute max supported link bandwidth
Add pcie_bandwidth_capable() to compute the max link bandwidth supported by
a device, based on the max link speed and width, adjusted by the encoding
overhead.

The maximum bandwidth of the link is computed as:

  max_link_width * max_link_speed * (1 - encoding_overhead)

2.5 and 5.0 GT/s links use 8b/10b encoding, which reduces the raw bandwidth
available by 20%; 8.0 GT/s and faster links use 128b/130b encoding, which
reduces it by about 1.5%.

The result is in Mb/s, i.e., megabits/second, of raw bandwidth.

Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: add 16 GT/s, adjust for pcie_get_speed_cap() and
pcie_get_width_cap() signatures, don't export outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-30 15:33:38 -05:00
Tal Gilboa
c70b65fb7f PCI: Add pcie_get_width_cap() to find max supported link width
Add pcie_get_width_cap() to find the max link width supported by a device.
Change max_link_width_show() to use pcie_get_width_cap().

Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: return width directly instead of error and *width, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2018-03-30 15:33:38 -05:00
Tal Gilboa
6cf57be0f7 PCI: Add pcie_get_speed_cap() to find max supported link speed
Add pcie_get_speed_cap() to find the max link speed supported by a device.
Change max_link_speed_show() to use pcie_get_speed_cap().

Signed-off-by: Tal Gilboa <talgi@mellanox.com>
[bhelgaas: return speed directly instead of error and *speed, don't export
outside drivers/pci]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
2018-03-30 15:33:38 -05:00
Dave Airlie
2b4f44eec2 Linux 4.16-rc7
-----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJauCZfAAoJEHm+PkMAQRiGWGUH/2rhdQDkoJpYWnjaQkolECG8
 MxpGE7nmIIHxQcbSDdHTGJ8IhVm6Z5wZ7ym/PwCDTT043Y1y341sJrIwL2/nTG6d
 HVidk8hFvgN6QzlzVAHT3ZZMII/V9Zt+VV5SUYLGnPAVuJNHo/6uzWlTU5g+NTFo
 IquFDdQUaGBlkKqby+NoAFnkV1UAIkW0g22cfvPnlO5GMer0gusGyVNvVp7TNj3C
 sqj4Hvt3RMDLMNe9RZ2pFTiOD096n8FWpYftZneUTxFImhRV3Jg5MaaYZm9SI3HW
 tXrv/LChT/F1mi5Pkx6tkT5Hr8WvcrwDMJ4It1kom10RqWAgjxIR3CMm448ileY=
 =YKUG
 -----END PGP SIGNATURE-----

Backmerge tag 'v4.16-rc7' into drm-next

Linux 4.16-rc7

This was requested by Daniel, and things were getting
a bit hard to reconcile, most of the conflicts were
trivial though.
2018-03-28 14:30:41 +10:00
Mika Westerberg
13d3047c81 ACPI / hotplug / PCI: Check presence of slot itself in get_slot_status()
Mike Lothian reported that plugging in a USB-C device does not work
properly in his Dell Alienware system.  This system has an Intel Alpine
Ridge Thunderbolt controller providing USB-C functionality.  In these
systems the USB controller (xHCI) is hotplugged whenever a device is
connected to the port using ACPI-based hotplug.

The ACPI description of the root port in question is as follows:

  Device (RP01)
  {
      Name (_ADR, 0x001C0000)

      Device (PXSX)
      {
          Name (_ADR, 0x02)

          Method (_RMV, 0, NotSerialized)
          {
              // ...
          }
      }

Here _ADR 0x02 means device 0, function 2 on the bus under root port (RP01)
but that seems to be incorrect because device 0 is the upstream port of the
Alpine Ridge PCIe switch and it has no functions other than 0 (the bridge
itself).  When we get ACPI Notify() to the root port resulting from
connecting a USB-C device, Linux tries to read PCI_VENDOR_ID from device 0,
function 2 which of course always returns 0xffffffff because there is no
such function and we never find the device.

In Windows this works fine.

Now, since we get ACPI Notify() to the root port and not to the PXSX device
we should actually start our scan from there as well and not from the
non-existent PXSX device.  Fix this by checking presence of the slot itself
(function 0) if we fail to do that otherwise.

While there use pci_bus_read_dev_vendor_id() in get_slot_status(), which is
the recommended way to read Device and Vendor IDs of devices on PCI buses.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=198557
Reported-by: Mike Lothian <mike@fireburn.co.uk>
Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: stable@vger.kernel.org
2018-03-23 16:11:00 -05:00
Lorenzo Pieralisi
745029187a PCI: pcie-xilinx-nwl: Fix mask value to disable MSIs
Compiling the xilinx-nwl driver with sparse checks result in the
following warning:

drivers/pci/host/pcie-xilinx-nwl.c:633:38: sparse: cast truncates bits
from constant value (ffffffff00000000 becomes 0)

Fix it by explicitly writing 0 to mask interrupts instead of relying
on a bogus cast applied to the mask bitwise complement.

Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Michal Simek <michal.simek@xilinx.com>
2018-03-23 11:16:21 +00:00
Fengguang Wu
6c994c504f PCI: v3-semi: Remove unnecessary semicolon
drivers/pci/host/pci-v3-semi.c:676:2-3: Unneeded semicolon

Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-03-22 12:24:31 +00:00
Fengguang Wu
d17086728c PCI: rcar: Remove unnecessary semicolon
Remove unneeded semicolon.

Generated by: scripts/coccinelle/misc/semicolon.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Signed-off-by: Julia Lawall <julia.lawall@lip6.fr>
Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-03-22 12:24:30 +00:00
Fengguang Wu
492d98e4f8 PCI: faraday: Make struct faraday_pci_variant static
This was generated from 0-day builder.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
[lorenzo.pieralisi@arm.com: reworked/split patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>
2018-03-22 12:22:30 +00:00
Fengguang Wu
e734016dd3 PCI: kirin: Make struct kirin_pcie_driver static
This was generated from 0-day builder.

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
[robh: add commit msg]
Signed-off-by: Rob Herring <robh@kernel.org>
[lorenzo.pieralisi@arm.com: reworked the commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-03-22 10:14:41 +00:00
Jay Fang
1acfb9b7ee PCI: Add decoding for 16 GT/s link speed
PCIe 4.0 defines the 16.0 GT/s link speed.  Links can run at that speed
without any Linux changes, but previously their sysfs "max_link_speed" and
"current_link_speed" files contained "Unknown speed", not the expected
"16.0 GT/s".

Add decoding for the new 16 GT/s link speed.

Signed-off-by: Jay Fang <f.fangjian@huawei.com>
[bhelgaas: add PCI_EXP_LNKCAP2_SLS_16_0GB]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
2018-03-21 16:23:55 -05:00
Rob Herring
d2fd7344a9 PCI: kirin: Fix missing dependency on PCI_MSI_IRQ_DOMAIN
PCIE_DW_HOST depends on PCI_MSI_IRQ_DOMAIN and since kirin selects
PCIE_DW_HOST, it must also depend on PCI_MSI_IRQ_DOMAIN. This was found
by 0-day once building on all arches was enabled.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-03-21 14:22:45 +00:00
Rob Herring
a1b363a53f PCI: iproc: Remove dependency on ARM specific struct pci_sys_data
The iproc driver is using ARM's struct pci_sys_data simply to store a
private data pointer. This is completely unnecessary, so store the
private data directly in bus->sysdata as is done on arm64.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Scott Branden <scott.branden@broadcom.com>
2018-03-21 14:22:36 +00:00
Rob Herring
e4aa4ae77a PCI: kirin: Remove unnecessary asm/compiler.h include
compiler.h is unnecessary and doesn't exist on some arches, so remove
it.

Signed-off-by: Rob Herring <robh@kernel.org>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
2018-03-20 17:42:45 +00:00
Linus Torvalds
efac2483e8 Merge branch 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata
Pull libata fixes from Tejun Heo:
 "I sat on them too long and it's quite a few this late, but nothing has
  a wide blast area. The changes are...

   - Fix corner cases in SG command handling.

   - Recent introduction of default powersaving mode config option
     exposed several devices with broken powersaving behaviors. A number
     of patches to update the blacklist accordingly.

   - Fix a kernel panic on SAS hotplug.

   - Other misc and device specific updates"

* 'for-4.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  libata: Modify quirks for MX100 to limit NCQ_TRIM quirk to MU01 version
  libata: Make Crucial BX100 500GB LPM quirk apply to all firmware versions
  libata: Apply NOLPM quirk to Crucial M500 480 and 960GB SSDs
  libata: Enable queued TRIM for Samsung SSD 860
  PCI: Add function 1 DMA alias quirk for Highpoint RocketRAID 644L
  ahci: Add PCI-id for the Highpoint Rocketraid 644L card
  ata: do not schedule hot plug if it is a sas host
  libata: disable LPM for Crucial BX100 SSD 500GB drive
  libata: Apply NOLPM quirk to Crucial MX100 512GB SSDs
  libata: update documentation for sysfs interfaces
  ata: sata_rcar: Remove unused variable in sata_rcar_init_controller()
  libata: transport: cleanup documentation of sysfs interface
  sata_rcar: Reset SATA PHY when Salvator-X board resumes
  libata: don't try to pass through NCQ commands to non-NCQ devices
  libata: remove WARN() for DMA or PIO command without data
  libata: fix length validation of ATAPI-relayed SCSI commands
  ata: libahci: fix comment indentation
  ahci: Add check for device presence (PCIe hot unplug) in ahci_stop_engine()
  libata: Fix compile warning with ATA_DEBUG enabled
2018-03-19 14:23:30 -07:00
KarimAllah Ahmed
bf4447fd1c PCI/IOV: Skip BAR sizing for VFs
Per PCIe r4.0, sec 9.3.4.1.11, the BAR registers in VF config space are all
RO Zero, so skip sizing them.

This is an optimization when enabling SR-IOV on a device with many VFs.

Suggested-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 14:55:17 -05:00
Bjorn Helgaas
df62ab5e0f PCI: Tidy comments
Remove pointless comments that tell us the file name, remove blank line
comments, follow multi-line comment conventions.  No functional change
intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 14:20:43 -05:00
Bjorn Helgaas
3133e6dd07 PCI: Tidy Makefiles
Indent things so they line up neatly and remove extra blank lines and
superfluous comments.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 14:20:41 -05:00
Bjorn Helgaas
6846b3b512 PCI: Report quirks that take more than 10ms
With "initcall_debug", we report how long every PCI quirk took.

Even without "initcall_debug", report the runtime of any quirk that takes
longer than 10ms.  This is to make it easier to notice quirks that slow
down boot.

This was motivated by a report from Paul Menzel that PCI final quirks took
half a second at boot.

Link: https://lkml.kernel.org/r/44cada166e42007d27b4c3e3aa0744d7@molgen.mpg.de
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:08:38 -05:00
Bjorn Helgaas
d89bd9195d PCI: Report quirk timings with pci_info() instead of pr_debug()
With "initcall_debug", we report how long every PCI quirk took.  Previously
we used pr_debug(), which means you have to figure out how to enable debug
output.

Log these timings using pci_info() instead so it doesn't depend on DEBUG,
CONFIG_DYNAMIC_DEBUG, etc.

Also, don't log anything at all unless "initcall_debug" is specified.  This
matches what we do in do_one_initcall_debug().

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:08:38 -05:00
Bjorn Helgaas
f9ea894ca5 PCI/VPD: Move VPD structures to vpd.c
The VPD-related structures are only used in vpd.c, so move them from
drivers/pci/pci.h to vpd.c.  No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:06:34 -05:00
Bjorn Helgaas
996058573b PCI/VPD: Move VPD quirks to vpd.c
Move the VPD-related quirks from quirks.c to vpd.c, which removes the need
for struct pci_vpd outside vpd.c.  The goal is to encapsulate all the VPD
code and structures in vpd.c.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:06:24 -05:00
Bjorn Helgaas
b1c615c48f PCI/VPD: Move VPD sysfs code to vpd.c
Move the VPD-related sysfs code from pci-sysfs.c to vpd.c.  This follows
the pattern of pcie_aspm_create_sysfs_dev_files().  The goal is to
encapsulate all the VPD code and structures in vpd.c.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:06:17 -05:00
Bjorn Helgaas
f0eb77ae6b PCI/VPD: Move VPD access code to vpd.c
Move the VPD-related code from access.c to vpd.c.  The goal is to
encapsulate all the VPD code and structures in vpd.c.

No functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-19 13:06:11 -05:00
Manikanta Maddireddy
da76ba5096 PCI: tegra: Add power management support
Tegra186 powergate driver is implemented as power domain driver, power
partition ungate/gate are registered as power_on/power_off callback
functions. There are no direct functions to power gate/ungate host
controller in Tegra186. Host controller driver should add "power-domains"
property in device tree and implement runtime suspend and resume
callback functons. Power gate and ungate is taken care by power domain
driver when host controller driver calls pm_runtime_put_sync and
pm_runtime_get_sync respectively.

Register suspend_noirq & resume_noirq callback functions to allow PCIe to
come up after resume from RAM. Both runtime and noirq pm ops share same
callback functions.

Signed-off-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
[lorenzo.pieralisi@arm.com: squashed patch to fix compilation]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
Tested-by: Thierry Reding <treding@nvidia.com>
2018-03-19 09:37:46 +00:00
Dexuan Cui
948373b3ed PCI: hv: Only queue new work items in hv_pci_devices_present() if necessary
If there is pending work in hv_pci_devices_present() we just need to add
the new dr entry into the dr_list. Add a check to detect pending work
items and update the code to skip queuing work if pending work items
are detected.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
2018-03-16 18:19:03 +00:00
Dexuan Cui
fca288c015 PCI: hv: Remove the bogus test in hv_eject_device_work()
When kernel is executing hv_eject_device_work(), hpdev->state value must
be hv_pcichild_ejecting; any other value would consist in a bug,
therefore replace the bogus check with an explicit WARN_ON() on the
condition failure detection.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: updated commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
2018-03-16 18:19:02 +00:00
Dexuan Cui
df3f2159f4 PCI: hv: Fix a comment typo in _hv_pcifront_read_config()
Comment in _hv_pcifront_read_config() contains a typo, fix it.

No functional change.

Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: changed commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
2018-03-16 18:19:02 +00:00
Dexuan Cui
de0aa7b2f9 PCI: hv: Fix 2 hang issues in hv_compose_msi_msg()
1. With the patch "x86/vector/msi: Switch to global reservation mode",
the recent v4.15 and newer kernels always hang for 1-vCPU Hyper-V VM
with SR-IOV. This is because when we reach hv_compose_msi_msg() by
request_irq() -> request_threaded_irq() ->__setup_irq()->irq_startup()
-> __irq_startup() -> irq_domain_activate_irq() -> ... ->
msi_domain_activate() -> ... -> hv_compose_msi_msg(), local irq is
disabled in __setup_irq().

Note: when we reach hv_compose_msi_msg() by another code path:
pci_enable_msix_range() -> ... -> irq_domain_activate_irq() -> ... ->
hv_compose_msi_msg(), local irq is not disabled.

hv_compose_msi_msg() depends on an interrupt from the host.
With interrupts disabled, a UP VM always hangs in the busy loop in
the function, because the interrupt callback hv_pci_onchannelcallback()
can not be called.

We can do nothing but work it around by polling the channel. This
is ugly, but we don't have any other choice.

2. If the host is ejecting the VF device before we reach
hv_compose_msi_msg(), in a UP VM, we can hang in hv_compose_msi_msg()
forever, because at this time the host doesn't respond to the
CREATE_INTERRUPT request. This issue exists the first day the
pci-hyperv driver appears in the kernel.

Luckily, this can also by worked around by polling the channel
for the PCI_EJECT message and hpdev->state, and by checking the
PCI vendor ID.

Note: actually the above 2 issues also happen to a SMP VM, if
"hbus->hdev->channel->target_cpu == smp_processor_id()" is true.

Fixes: 4900be83602b ("x86/vector/msi: Switch to global reservation mode")
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
2018-03-16 18:19:01 +00:00
Dexuan Cui
021ad274d7 PCI: hv: Serialize the present and eject work items
When we hot-remove the device, we first receive a PCI_EJECT message and
then receive a PCI_BUS_RELATIONS message with bus_rel->device_count == 0.

The first message is offloaded to hv_eject_device_work(), and the second
is offloaded to pci_devices_present_work(). Both the paths can be running
list_del(&hpdev->list_entry), causing general protection fault, because
system_wq can run them concurrently.

The patch eliminates the race condition.

Since access to present/eject work items is serialized, we do not need the
hbus->enum_sem anymore, so remove it.

Fixes: 4daace0d8ce8 ("PCI: hv: Add paravirtual PCI front-end for Microsoft Hyper-V VMs")
Link: https://lkml.kernel.org/r/KL1P15301MB00064DA6B4D221123B5241CFBFD70@KL1P15301MB0006.APCP153.PROD.OUTLOOK.COM
Tested-by: Adrian Suhov <v-adsuho@microsoft.com>
Tested-by: Chris Valean <v-chvale@microsoft.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
[lorenzo.pieralisi@arm.com: squashed semaphore removal patch]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Acked-by: Haiyang Zhang <haiyangz@microsoft.com>
Cc: <stable@vger.kernel.org> # v4.6+
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Jack Morgenstein <jackm@mellanox.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
2018-03-16 18:18:50 +00:00
Arnd Bergmann
bb9d812643 arch: remove tile port
The Tile architecture port was added by Chris Metcalf in 2010, and
maintained until early 2018 when he orphaned it due to his departure
from Mellanox, and nobody else stepped up to maintain it. The product
line is still around in the form of the BlueField SoC, but no longer
uses the Tile architecture.

There are also still products for sale with Tile-GX SoCs, notably the
Mikrotik CCR router family. The products all use old (linux-3.3) kernels
with lots of patches and won't be upgraded by their manufacturers. There
have been efforts to port both OpenWRT and Debian to these, but both
projects have stalled and are very unlikely to be continued in the future.

Given that we are reasonably sure that nobody is still using the port
with an upstream kernel any more, it seems better to remove it now while
the port is in a good shape than to let it bitrot for a few years first.

Cc: Chris Metcalf <chris.d.metcalf@gmail.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Link: http://www.mellanox.com/page/npu_multicore_overview
Link: https://jenkins.debian.net/view/rebootstrap/job/rebootstrap_tilegx_gcc7/
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2018-03-16 10:56:03 +01:00
Arnd Bergmann
6de3ff9000 PCI: tegra: Add PCI_MSI_IRQ_DOMAIN kconfig dependency
Building the tegra PCIe host driver without MSI results in a link
failure:

drivers/pci/host/pci-tegra.o:(.data+0x70): undefined reference to
`pci_msi_unmask_irq'
drivers/pci/host/pci-tegra.o:(.data+0x74): undefined reference to
`pci_msi_mask_irq'

This adds the same dependency that everyone else uses.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
[lorenzo.pieralisi@arm.com: rewrote commit log]
Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Acked-by: Thierry Reding <treding@nvidia.com>
2018-03-14 18:30:09 +00:00
Lukas Wunner
07f4f97d7b vga_switcheroo: Use device link for HDA controller
Back in 2013, runtime PM for GPUs with integrated HDA controller was
introduced with commits 0d69704ae348 ("gpu/vga_switcheroo: add driver
control power feature. (v3)") and 246efa4a072f ("snd/hda: add runtime
suspend/resume on optimus support (v4)").

Briefly, the idea was that the HDA controller is forced on and off in
unison with the GPU.

The original code is mostly still in place even though it was never a
100% perfect solution:  E.g. on access to the HDA controller, the GPU
is powered up via vga_switcheroo_runtime_resume_hdmi_audio() but there
are no provisions to keep it resumed until access to the HDA controller
has ceased:  The GPU autosuspends after 5 seconds, rendering the HDA
controller inaccessible.

Additionally, a kludge is required when hda_intel.c probes:  It has to
check whether the GPU is powered down (check_hdmi_disabled()) and defer
probing if so.

However in the meantime (in v4.10) the driver core has gained a feature
called device links which promises to solve such issues in a clean way:
It allows us to declare a dependency from the HDA controller (consumer)
to the GPU (supplier).  The PM core then automagically ensures that the
GPU is runtime resumed as long as the HDA controller's ->probe hook is
executed and whenever the HDA controller is accessed.

By default, the HDA controller has a dependency on its parent, a PCIe
Root Port.  Adding a device link creates another dependency on its
sibling:

                            PCIe Root Port
                             ^          ^
                             |          |
                             |          |
                            HDA  ===>  GPU

The device link is not only used for runtime PM, it also guarantees that
on system sleep, the HDA controller suspends before the GPU and resumes
after the GPU, and on system shutdown the HDA controller's ->shutdown
hook is executed before the one of the GPU.  It is a complete solution.

Using this functionality is as simple as calling device_link_add(),
which results in a dmesg entry like this:

        pci 0000:01:00.1: Linked as a consumer to 0000:01:00.0

The code for the GPU-governed audio power management can thus be removed
(except where it's still needed for legacy manual power control).

The device link is added in a PCI quirk rather than in hda_intel.c.
It is therefore legal for the GPU to runtime suspend to D3cold even if
the HDA controller is not bound to a driver or if CONFIG_SND_HDA_INTEL
is not enabled, for accesses to the HDA controller will cause the GPU to
wake up regardless if they're occurring outside of hda_intel.c (think
config space readout via sysfs).

Contrary to the previous implementation, the HDA controller's power
state is now self-governed, rather than GPU-governed, whereas the GPU's
power state is no longer fully self-governed.  (The HDA controller needs
to runtime suspend before the GPU can.)

It is thus crucial that runtime PM is always activated on the HDA
controller even if CONFIG_SND_HDA_POWER_SAVE_DEFAULT is set to 0 (which
is the default), lest the GPU stays awake.  This is achieved by setting
the auto_runtime_pm flag on every codec and the AZX_DCAPS_PM_RUNTIME
flag on the HDA controller.

A side effect is that power consumption might be reduced if the GPU is
in use but the HDA controller is not, because the HDA controller is now
allowed to go to D3hot.  Before, it was forced to stay in D0 as long as
the GPU was in use.  (There is no reduction in power consumption on my
Nvidia GK107, but there might be on other chips.)

The code paths for legacy manual power control are adjusted such that
runtime PM is disabled during power off, thereby preventing the PM core
from resuming the HDA controller.

Note that the device link is not only added on vga_switcheroo capable
systems, but for *any* GPU with integrated HDA controller.  The idea is
that the HDA controller streams audio via connectors located on the GPU,
so the GPU needs to be on for the HDA controller to do anything useful.

This commit implicitly fixes an unbalanced runtime PM ref upon unbind of
hda_intel.c:  On ->probe, a runtime PM ref was previously released under
the condition "azx_has_pm_runtime(chip) || hda->use_vga_switcheroo", but
on ->remove a runtime PM ref was only acquired under the first of those
conditions.  Thus, binding and unbinding the driver twice on a
vga_switcheroo capable system caused the runtime PM refcount to drop
below zero.  The issue is resolved because the AZX_DCAPS_PM_RUNTIME flag
is now always set if use_vga_switcheroo is true.

For more information on device links please refer to:
https://www.kernel.org/doc/html/latest/driver-api/device_link.html
Documentation/driver-api/device_link.rst

Cc: Dave Airlie <airlied@redhat.com>
Cc: Ben Skeggs <bskeggs@redhat.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Iwai <tiwai@suse.de>
Reviewed-by: Peter Wu <peter@lekensteyn.nl>
Tested-by: Kai Heng Feng <kai.heng.feng@canonical.com> # AMD PowerXpress
Tested-by: Mike Lothian <mike@fireburn.co.uk>          # AMD PowerXpress
Tested-by: Denis Lisov <dennis.lissov@gmail.com>       # Nvidia Optimus
Tested-by: Peter Wu <peter@lekensteyn.nl>              # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de>              # MacBook Pro
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/51bd38360ff502a8c42b1ebf4405ee1d3f27118d.1520068884.git.lukas@wunner.de
2018-03-13 22:58:09 +01:00
Lukas Wunner
2a4d2c4240 PCI: Make pci_wakeup_bus() & pci_bus_set_current_state() public
There are PCI devices which are power-manageable by a nonstandard means,
such as a custom ACPI method.  One example are discrete GPUs in hybrid
graphics laptops, another are Thunderbolt controllers in Macs.

Such devices can't be put into D3cold with pci_set_power_state() because
pci_platform_power_transition() fails with -ENODEV.  Instead they're put
into D3hot by pci_set_power_state() and subsequently into D3cold by
invoking the nonstandard means.  However as a consequence the cached
current_state is incorrectly left at D3hot.

What we need to do is walk the hierarchy below such a PCI device on
powerdown and update the current_state to D3cold.  On powerup the PCI
device itself and the hierarchy below it is in D0uninitialized, so we
need to walk the hierarchy again and wake all devices, causing them to
be put into D0active and then letting them autosuspend as they see fit.

To this end make pci_wakeup_bus() & pci_bus_set_current_state() public
so PCI drivers don't have to reinvent the wheel.

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/2962443259e7faec577274b4ef8c54aad66f9a94.1520068884.git.lukas@wunner.de
2018-03-13 22:57:36 +01:00
Rafael J. Wysocki
5775b843a6 PCI: Restore config space on runtime resume despite being unbound
We leave PCI devices not bound to a driver in D0 during runtime suspend.
But they may have a parent which is bound and can be transitioned to
D3cold at runtime.  Once the parent goes to D3cold, the unbound child
may go to D3cold as well.  When the child goes to D3cold, its internal
state, including configuration of BARs, MSI, ASPM, MPS, etc., is lost.

One example are recent hybrid graphics laptops which cut power to the
discrete GPU when the root port above it goes to ACPI power state D3.
Users may provoke this by unbinding the GPU driver and allowing runtime
PM on the GPU via sysfs:  The PM core will then treat the GPU as
"suspended", which in turn allows the root port to runtime suspend,
causing the power resources listed in its _PR3 object to be powered off.
The GPU's BARs will be uninitialized when a driver later probes it.

Another example are hybrid graphics laptops where the GPU itself (rather
than the root port) is capable of runtime suspending to D3cold.  If the
GPU's integrated HDA controller is not bound and the GPU's driver
decides to runtime suspend to D3cold, the HDA controller's BARs will be
uninitialized when a driver later probes it.

Fix by saving and restoring config space over a runtime suspend cycle
even if the device is not bound.

Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Peter Wu <peter@lekensteyn.nl>              # Nvidia Optimus
Tested-by: Lukas Wunner <lukas@wunner.de>              # MacBook Pro
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
[lukas: add commit message, bikeshed code comments for clarity]
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Link: https://patchwork.freedesktop.org/patch/msgid/92fb6e6ae2730915eb733c08e2f76c6a313e3860.1520068884.git.lukas@wunner.de
2018-03-13 22:56:44 +01:00
Simon Guo
97c6f25d58 PCI/hotplug: ppc: correct a php_slot usage after free
In pnv_php_unregister_one(), pnv_php_put_slot() might kfree
php_slot structure. But there is pci_hp_deregister() after
that with php_slot reference.

This patch moves pnv_php_put_slot() to the end of function.

Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-13 15:50:32 +11:00
Bjorn Helgaas
a39bd851dc PCI/PM: Clear PCIe PME Status bit in core, not PCIe port driver
fe31e69740ed ("PCI/PCIe: Clear Root PME Status bits early during system
resume") added a .resume_noirq() callback to the PCIe port driver to clear
the PME Status bit during resume to work around a BIOS issue.

The BIOS evidently enabled PME interrupts for ACPI-based runtime wakeups
but did not clear the PME Status bit during resume, which meant PMEs after
resume did not trigger interrupts because PME Status did not transition
from cleared to set.

The fix was in the PCIe port driver, so it worked when CONFIG_PCIEPORTBUS
was set.  But I think we *always* want the fix because the platform may use
PME interrupts even if Linux is built without the PCIe port driver.

Move the fix from the port driver to the PCI core so we can work around
this "PME doesn't work after waking from a sleep state" issue regardless of
CONFIG_PCIEPORTBUS.

[bhelgaas: folded in warning fix from Arnd Bergmann <arnd@arndb.de>:
https://lkml.kernel.org/r/20180328134747.2062348-1-arnd@arndb.de]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2018-03-12 09:16:20 -05:00
Bjorn Helgaas
dcb0453d71 PCI/PM: Move pcie_clear_root_pme_status() to core
Move pcie_clear_root_pme_status() from the port driver to the PCI core so
it will be available even when the port driver isn't present.  No
functional change intended.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-03-12 09:15:39 -05:00
Linus Torvalds
c68a2cf07a pci-v4.16-fixes-3
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAlqiut4UHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vx64g//XKYaUwlc6e2c0XxTmpOe/v/V/NF3
 nXCvkmZRb2GW3xsYyrzzHBeUJtgRVBZ58yQtw7jXqS8DxyXiGzGEGrCHOzWOXiFm
 JppL24KgdZk/Az4Fo1rN8h6bydeeTbkztLJBzBDQghmeaj7VQiI+aePjUAFmOvd/
 rBg25nERo3yI/bPh6lMjLwZp3e89+cKn7na9FX8xKt32T2YSVqnjYFn6xoiC6UMY
 9Gaferv5ZECdTViq9lZKHTqg8f4Xjc/ln3qYMs/CXf3qNb6gt1gtA8f5X78z7l24
 h3MIwe7nxX0iyKjUImQeRSsXcSvDYiZHqwHJilm5TKmvwl+r3OB5D6YsfMRP2GN8
 CRNiNUQApxGRqo+NJOoR1Ca+tkWjfnFClOMzDC1b0zwKBlLSyHGSUKIsoQ160djs
 8aZZqKiVLeHgZrEWrb1601VeOjxDzdNfLn0/8fMsDS4avhoAUVL/Rqze8nowaDcM
 98AJuGu8of3tWfz7A7WiBn9Y17R8tIYwdjAtL6O9YlhdpiazUkLo9HmquKXK5Yv7
 2Q5+uoMPXsVN5OXKGKOXWTo1YarqQRK3c1iTK9kb8wolkKTcuSN62wHDtsxWzDCG
 VxEwjz/yrbiNiTqznT4iSlXvlUwPl65W0+TFnLRh0LFXtjUAHiYcL5qEe3ah7TWc
 vf1Mc1v8SJiUtmU=
 =2F4R
 -----END PGP SIGNATURE-----

Merge tag 'pci-v4.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci

Pull PCI fixes from Bjorn Helgaas:

 - fix sparc build issue when OF_IRQ not enabled (Guenter Roeck)

 - fix enumeration of devices below switches on DesignWare-based
   controllers (Koen Vandeputte)

* tag 'pci-v4.16-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  PCI: dwc: Fix enumeration end when reaching root subordinate
  PCI: Move of_irq_parse_and_map_pci() declaration under OF_IRQ
2018-03-09 13:31:08 -08:00
Bjorn Helgaas
ef7942603e PCI/portdrv: Merge pcieport_if.h into portdrv.h
pcieport_if.h contained the interfaces to register port service driver,
e.g., pcie_port_service_register().  portdrv.h contained internal data
structures of the port driver.

I don't think it's worth keeping those files separate, since both headers
and their users are all inside the PCI core.

Merge pcieport_if.h directly in drivers/pci/pcie/portdrv.h and update the
users to include that instead.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2018-03-09 11:42:01 -06:00
Jan Kiszka
690f430410 PCI: Scan all functions when running over Jailhouse
Per PCIe r4.0, sec 7.5.1.1.9, multi-function devices are required to have a
function 0.  Therefore, Linux scans for devices at function 0 (devfn
0/8/16/...) and only scans for other functions if function 0 has its
Multi-Function Device bit set or ARI or SR-IOV indicate there are more
functions.

The Jailhouse hypervisor may pass individual functions of a multi-function
device to a guest without passing function 0, which means a Linux guest
won't find them.

Change Linux PCI probing so it scans all function numbers when running as a
guest over Jailhouse.

This is technically prohibited by the spec, so it is possible that PCI
devices without the Multi-Function Device bit set may have unexpected
behavior in response to this probe.

Originally-by: Benedikt Spranger <b.spranger@linutronix.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Andy Shevchenko <andy.shevchenko@gmail.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: jailhouse-dev@googlegroups.com
Cc: Benedikt Spranger <b.spranger@linutronix.de>
Cc: linux-pci@vger.kernel.org
Cc: virtualization@lists.linux-foundation.org
Link: https://lkml.kernel.org/r/06e279b2a3e06cf6689ab3975f8ab592bba02362.1520408357.git.jan.kiszka@siemens.com
2018-03-08 12:30:37 +01:00