linux/drivers/pci/controller
Dexuan Cui c234ba8042 PCI: hv: Only reuse existing IRTE allocation for Multi-MSI
Jeffrey added Multi-MSI support to the pci-hyperv driver with the following four patches:
08e61e861a ("PCI: hv: Fix multi-MSI to allow more than one MSI vector")
455880dfe2 ("PCI: hv: Fix hv_arch_irq_unmask() for multi-MSI")
b4b77778ec ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()")
a2bad844a6 ("PCI: hv: Fix interrupt mapping for multi-MSI")

It turns out that the third patch (b4b77778ec) causes a performance
regression because all the interrupts now happen on one physical CPU (pCPU),
or on two pCPUs if one pCPU doesn't have enough vectors. When a guest has
many PCI devices, it may suffer from soft lockups under a heavy workload; see
https://lwn.net/ml/linux-kernel/20220804025104.15673-1-decui@microsoft.com/

Commit b4b77778ec itself is good. The real issue is that the hypercall in
hv_irq_unmask() -> hv_arch_irq_unmask() ->
hv_do_hypercall(HVCALL_RETARGET_INTERRUPT...) only changes the target
virtual CPU (vCPU), not the physical CPU. With b4b77778ec, the pCPU is
determined only once, in hv_compose_msi_msg(), where only vCPU0 is specified;
consequently the hypervisor uses only one target pCPU for all the interrupts.

Note: before b4b77778ec, the pCPU was determined twice, and the second
time the vCPU in the effective affinity mask was used (i.e., it wasn't
always vCPU0), so the hypervisor chose a different pCPU for each
interrupt.

The hypercall will be fixed in the future to update the pCPU as well, but
that will take quite a while, so let's restore the old behavior in
hv_compose_msi_msg(), i.e., don't reuse the existing IRTE allocation for
single-MSI and MSI-X; for multi-MSI, we choose the vCPU in a round-robin
manner for each PCI device, so the interrupts of different devices can
happen on different pCPUs, though the interrupts of each device happen on
some single pCPU.
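
The round-robin choice for multi-MSI can be pictured with a minimal
userspace sketch. This is not the driver's code: the names last_vcpu,
next_online_wrap() and pick_multi_msi_vcpu(), the 64-bit online mask, and
the absence of locking are all assumptions made for illustration (kernel
code would need to serialize access to the shared counter).

```c
#include <stdint.h>

/* Hypothetical model: bit i of online_mask set means vCPU i is online. */
#define MAX_VCPUS 64

/* Last vCPU handed out; -1 so the first search starts at vCPU 0. */
static int last_vcpu = -1;

/* Return the next online vCPU after prev, wrapping around; -1 if none. */
static int next_online_wrap(int prev, uint64_t online_mask)
{
    for (int step = 1; step <= MAX_VCPUS; step++) {
        int cpu = (prev + step) % MAX_VCPUS;

        if (online_mask & (UINT64_C(1) << cpu))
            return cpu;
    }
    return -1;
}

/*
 * Pick the target vCPU for one device's multi-MSI block. Each call
 * advances the shared cursor, so consecutive devices land on different
 * vCPUs while all vectors of one device share the vCPU chosen here.
 */
static int pick_multi_msi_vcpu(uint64_t online_mask)
{
    int cpu = next_online_wrap(last_vcpu, online_mask);

    if (cpu >= 0)
        last_vcpu = cpu;
    return cpu;
}
```

In this sketch, with vCPUs 0, 2 and 3 online, successive devices would be
assigned vCPU 0, 2, 3 and then 0 again.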

The hypercall fix may not be backported to all old versions of Hyper-V, so
we want to have this guest side change forever (or at least till we're sure
the old affected versions of Hyper-V are no longer supported).

Fixes: b4b77778ec ("PCI: hv: Reuse existing IRTE allocation in compose_msi_msg()")
Co-developed-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Signed-off-by: Jeffrey Hugo <quic_jhugo@quicinc.com>
Co-developed-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Carl Vanderlip <quic_carlv@quicinc.com>
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20221104222953.11356-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
2022-11-12 12:43:59 +00:00
cadence PCI: Convert to new *_PM_OPS macros 2022-07-27 11:56:17 -05:00
dwc Merge branch 'pci/qcom' 2022-10-05 17:32:57 -05:00
mobiveil PCI: Add defines for normal and subtractive PCI bridges 2022-02-17 15:29:35 -06:00
Kconfig arm64: bcmbca: Make BCM4908 drivers depend on ARCH_BCMBCA 2022-08-15 09:55:34 -07:00
Makefile Merge branch 'pci/host/mt7621' 2021-11-05 11:28:51 -05:00
pci-aardvark.c Merge branch 'remotes/lorenzo/pci/bridge-emul' 2022-10-05 17:32:55 -05:00
pci-ftpci100.c PCI: ftpci100: Use PCI_CONF1_ADDRESS() macro 2022-09-27 11:08:20 +02:00
pci-host-common.c PCI/MSI: Make pci_host_common_probe() declare its reliance on MSI domains 2021-04-20 14:11:22 +01:00
pci-host-generic.c
pci-hyperv-intf.c
pci-hyperv.c PCI: hv: Only reuse existing IRTE allocation for Multi-MSI 2022-11-12 12:43:59 +00:00
pci-ixp4xx.c ARM: ixp4xx: fix building both pci drivers 2021-08-12 23:10:09 +02:00
pci-loongson.c PCI: loongson: Work around LS7A incorrect Interrupt Pin registers 2022-07-21 12:42:00 -05:00
pci-mvebu.c Merge branch 'remotes/lorenzo/pci/mvebu' 2022-10-05 17:32:56 -05:00
pci-rcar-gen2.c PCI: rcar-gen2: Add RZ/N1 SOC family compatible string 2022-06-23 17:37:05 -05:00
pci-tegra.c Revert "PCI: tegra: Use PCI_CONF1_EXT_ADDRESS() macro" 2022-10-17 12:11:09 -05:00
pci-thunder-ecam.c PCI: thunder: Drop error data fabrication when config read fails 2021-11-18 13:39:32 -06:00
pci-thunder-pem.c PCI: thunder: Drop error data fabrication when config read fails 2021-11-18 13:39:32 -06:00
pci-v3-semi.c
pci-versatile.c PCI: versatile: Remove redundant variable retval 2022-04-28 10:46:37 +01:00
pci-xgene-msi.c PCI: xgene-msi: Use bitmap_zalloc() when applicable 2021-11-29 17:29:15 +00:00
pci-xgene.c PCI: Drop of_match_ptr() to avoid unused variables 2022-07-06 14:34:09 -05:00
pcie-altera-msi.c PCI: Bulk conversion to generic_handle_domain_irq() 2021-08-02 11:53:05 -05:00
pcie-altera.c Merge branch 'pci/driver-cleanup' 2022-01-13 09:57:53 -06:00
pcie-apple.c PCI: apple: Do not leak reset GPIO on unbind/unload/error 2022-09-14 17:45:47 +02:00
pcie-brcmstb.c PCI: brcmstb: Rename .map_bus() functions to end with 'map_bus' 2022-07-27 11:53:12 -05:00
pcie-hisi-error.c
pcie-iproc-bcma.c PCI: Add defines for normal and subtractive PCI bridges 2022-02-17 15:29:35 -06:00
pcie-iproc-msi.c PCI: iproc: Use bitmap API to allocate bitmaps 2022-07-05 15:02:56 -05:00
pcie-iproc-platform.c PCI: iproc: Rename iproc_pcie_pltfm_ to iproc_pltfm_pcie_ 2022-01-03 15:01:53 -06:00
pcie-iproc.c PCI: iproc: Set all 24 bits of PCI class code 2022-02-17 15:30:01 -06:00
pcie-iproc.h PCI: Fix kernel-doc formatting 2021-07-06 10:37:46 -05:00
pcie-mediatek-gen3.c PCI: mediatek-gen3: Change driver name to mtk-pcie-gen3 2022-08-23 14:58:49 +02:00
pcie-mediatek.c PCI: Convert to new *_PM_OPS macros 2022-07-27 11:56:17 -05:00
pcie-microchip-host.c PCI: microchip: Fix refcount leak in mc_pcie_init_irq_domains() 2022-06-08 15:26:24 -05:00
pcie-mt7621.c PCI: mt7621: Use PCI_CONF1_EXT_ADDRESS() macro 2022-09-27 11:08:20 +02:00
pcie-rcar-ep.c PCI: rcar-ep: Remove unneeded includes 2021-10-08 09:41:38 -05:00
pcie-rcar-host.c PCI: Convert to new *_PM_OPS macros 2022-07-27 11:56:17 -05:00
pcie-rcar.c
pcie-rcar.h PCI: rcar: Add L1 link state fix into data abort hook 2021-08-16 14:51:30 +01:00
pcie-rockchip-ep.c PCI: rockchip: Fix find_first_zero_bit() limit 2022-04-08 14:42:07 +01:00
pcie-rockchip-host.c PCI: Convert to new *_PM_OPS macros 2022-07-27 11:56:17 -05:00
pcie-rockchip.c
pcie-rockchip.h PCI: Add defines for normal and subtractive PCI bridges 2022-02-17 15:29:35 -06:00
pcie-xilinx-cpm.c PCI: xilinx-cpm: Add support for Versal CPM5 Root Port 2022-07-22 14:21:06 -05:00
pcie-xilinx-nwl.c PCI: xilinx-nwl: Simplify code and fix a memory leak 2021-12-01 09:26:51 +00:00
pcie-xilinx.c PCI: xilinx: Rename xilinx_pcie_port to xilinx_pcie 2022-01-03 15:05:28 -06:00
vmd.c PCI: vmd: Add DID 8086:7D0B and 8086:AD0B for Intel MTL SKUs 2022-06-28 18:36:12 -05:00