linux

iv/linux

Author	SHA1	Message	Date
Bjorn Helgaas	d7493f273b	Merge branch 'pci/controller/layerscape' - Add ls1028a endpoint mode support (Xiaowei Bao) * pci/controller/layerscape: PCI: layerscape: Add EP mode support for ls1028a	2023-04-20 16:16:37 -05:00
Bjorn Helgaas	2ad2e01914	Merge branch 'pci/controller/kirin' - Select CONFIG_REGMAP_MMIO so kirin driver links correctly (Josh Triplett) * pci/controller/kirin: PCI: kirin: Select REGMAP_MMIO	2023-04-20 16:16:36 -05:00
Bjorn Helgaas	73af737eea	Merge branch 'pci/controller/ixp4xx' - Use the PCI_CONF1_ADDRESS() macro to simplify config space address computation (Pali Rohár) * pci/controller/ixp4xx: PCI: ixp4xx: Use PCI_CONF1_ADDRESS() macro	2023-04-20 16:16:36 -05:00
Bjorn Helgaas	0c78d418e9	Merge branch 'pci/controller/dwc' - Install i.MX6 PCI abort handler only when DT contains a PCI controller claimed by the imx6 driver (H. Nikolaus Schaller) * pci/controller/dwc: PCI: imx6: Install the fault handler only on compatible match	2023-04-20 16:16:35 -05:00
Bjorn Helgaas	1c03b5bfc5	Merge branch 'pci/resource' - Add pci_dev_for_each_resource() and pci_bus_for_each_resource() iterators to simplify loops (Andy Shevchenko) * pci/resource: EISA: Drop unused pci_bus_for_each_resource() index argument PCI: Make pci_bus_for_each_resource() index optional PCI: Document pci_bus_for_each_resource() PCI: Introduce pci_dev_for_each_resource() PCI: Introduce pci_resource_n()	2023-04-20 16:16:34 -05:00
Bjorn Helgaas	43ca31e002	Merge branch 'pci/reset' - Wait longer for devices to become ready after resume (as we do for reset) to accommodate Intel Titan Ridge xHCI devices (Mika Westerberg) - Drop pci_bridge_wait_for_secondary_bus() timeout parameter since all callers pass the same value (Mika Westerberg) - Extend D3hot delay for NVIDIA HDA controllers to avoid unrecoverable devices after a bus reset (Alex Williamson) * pci/reset: PCI/PM: Extend D3hot delay for NVIDIA HDA controllers PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter PCI/PM: Increase wait time after resume	2023-04-20 16:16:33 -05:00
Bjorn Helgaas	cc8a983d0f	Merge branch 'pci/p2pdma' - Fix pci_p2pmem_find_many() kernel-doc (Cai Huoqing) * pci/p2pdma: PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-doc	2023-04-20 16:16:33 -05:00
Bjorn Helgaas	8745c3d542	Merge branch 'pci/hotplug' - Fix pciehp AB-BA deadlock between reset_lock and device_lock (Lukas Wunner) * pci/hotplug: PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock	2023-04-20 16:16:33 -05:00
Bjorn Helgaas	66d3d0d0e8	Merge branch 'pci/enumeration' - Use of_property_present(), instead of lower-level functions like of_get_property(), for testing DT property presence (Rob Herring) * pci/enumeration: PCI: Use of_property_present() for testing DT property presence	2023-04-20 16:16:32 -05:00
Rob Herring	0d21e71a91	PCI: Restrict device disabled status check to DT Commit `6fffbc7ae1` ("PCI: Honor firmware's device disabled status") checked the firmware device status for both DT and ACPI devices. That caused a regression in some ACPI systems. The exact reason isn't clear. It's possibly a firmware bug. For now, at least, refactor the check to be for DT based systems only. Note that the original implementation leaked a refcount which is now correctly handled. [bhelgaas: Per ACPI r6.5, sec 6.3.7, for devices on an enumerable bus, _STA must return with bit[0] ("device is present") set] Link: https://lore.kernel.org/all/m2fs9lgndw.fsf@gmail.com/ Fixes: `6fffbc7ae1` ("PCI: Honor firmware's device disabled status") Link: https://lore.kernel.org/r/20230419193513.708818-1-robh@kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217317 Reported-by: Donald Hunter <donald.hunter@gmail.com> Reported-by: Vitaly Kuznetsov <vkuznets@redhat.com> Tested-by: Donald Hunter <donald.hunter@gmail.com> Tested-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Binbin Zhou <zhoubinbin@loongson.cn> Cc: Liu Peibao <liupeibao@loongson.cn> Cc: Huacai Chen <chenhuacai@loongson.cn>	2023-04-20 13:30:14 -05:00
Rob Herring	9195ee1a1f	PCI: Use of_property_present() for testing DT property presence It is preferred to use typed property access functions (i.e. of_property_read_<type> functions) rather than low-level of_get_property()/of_find_property() functions for reading properties. As part of this, convert of_get_property()/of_find_property() calls to the recently added of_property_present() helper when we just want to test for presence of a property and nothing more. Link: https://lore.kernel.org/r/20230310144719.1544443-1-robh@kernel.org Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> # pcie-mediatek	2023-04-18 16:01:37 -05:00
Lukas Wunner	cedf8d8a50	PCI/DOE: Relax restrictions on request and response size An upcoming user of DOE is CMA (Component Measurement and Authentication, PCIe r6.0 sec 6.31). It builds on SPDM (Security Protocol and Data Model): https://www.dmtf.org/dsp/DSP0274 SPDM message sizes are not always a multiple of dwords. To transport them over DOE without using bounce buffers, allow sending requests and receiving responses whose final dword is only partially populated. To be clear, PCIe r6.0 sec 6.30.1 specifies the Data Object Header 2 "Length" in dwords and pci_doe_send_req() and pci_doe_recv_resp() read/write dwords. So from a spec point of view, DOE is still specified in dwords and allowing non-dword request/response buffers is merely for the convenience of callers. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/151b1a6a1794afb65d941287ecbc032c5b8004b9.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	74e491e5d1	PCI/DOE: Make mailbox creation API private The PCI core has just been amended to create a pci_doe_mb struct for every DOE instance on device enumeration. CXL (the only in-tree DOE user so far) has been migrated to use those mailboxes instead of creating its own. That leaves pcim_doe_create_mb() and pci_doe_for_each_off() without any callers, so drop them. pci_doe_supports_prot() is now only used internally, so declare it static. pci_doe_destroy_mb() is no longer used as callback for devm_add_action(), so refactor it to accept a struct pci_doe_mb pointer instead of a generic void pointer. Because pci_doe_create_mb() is only called on device enumeration, i.e. before driver binding, the workqueue name never contains a driver name. So replace dev_driver_string() with dev_bus_name() when generating the workqueue name. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/64f614b6584982986c55d2c6229b4ee2b276dd59.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	ac04840350	PCI/DOE: Create mailboxes on device enumeration Currently a DOE instance cannot be shared by multiple drivers because each driver creates its own pci_doe_mb struct for a given DOE instance. For the same reason a DOE instance cannot be shared between the PCI core and a driver. Moreover, finding out which protocols a DOE instance supports requires creating a pci_doe_mb for it. If a device has multiple DOE instances, a driver looking for a specific protocol may need to create a pci_doe_mb for each of the device's DOE instances and then destroy those which do not support the desired protocol. That's obviously an inefficient way to do things. Overcome these issues by creating mailboxes in the PCI core on device enumeration. Provide a pci_find_doe_mailbox() API call to allow drivers to get a pci_doe_mb for a given (pci_dev, vendor, protocol) triple. This API is modeled after pci_find_capability() and can later be amended with a pci_find_next_doe_mailbox() call to iterate over all mailboxes of a given pci_dev which support a specific protocol. On removal, destroy the mailboxes in pci_destroy_dev(), after the driver is unbound. This allows drivers to use DOE in their ->remove() hook. On surprise removal, cancel ongoing DOE exchanges and prevent new ones from being scheduled. Thereby ensure that a hot-removed device doesn't needlessly wait for a running exchange to time out. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/40a6f973f72ef283d79dd55e7e6fddc7481199af.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	022b66f381	PCI/DOE: Allow mailbox creation without devres management DOE mailbox creation is currently only possible through a devres-managed API. The lifetime of mailboxes thus ends with driver unbinding. An upcoming commit will create DOE mailboxes upon device enumeration by the PCI core. Their lifetime shall not be limited by a driver. Therefore rework pcim_doe_create_mb() into the non-devres-managed pci_doe_create_mb(). Add pci_doe_destroy_mb() for mailbox destruction on device removal. Provide a devres-managed wrapper under the existing pcim_doe_create_mb() name. The error path of pcim_doe_create_mb() previously called xa_destroy() if alloc_ordered_workqueue() failed. That's unnecessary because the xarray is still empty at that point. It doesn't need to be destroyed until it's been populated by pci_doe_cache_protocols(). Arrange the error path of the new pci_doe_create_mb() accordingly. pci_doe_cancel_tasks() is no longer used as callback for devm_add_action(), so refactor it to accept a struct pci_doe_mb pointer instead of a generic void pointer. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/7c9a63867d70233c5e9d26cd8bf956742cd6d650.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	c8fc07abeb	PCI/DOE: Deduplicate mailbox flushing When a DOE mailbox is torn down, its workqueue is flushed once in pci_doe_flush_mb() through a call to flush_workqueue() and subsequently flushed once more in pci_doe_destroy_workqueue() through a call to destroy_workqueue(). Deduplicate by dropping flush_workqueue() from pci_doe_flush_mb(). Rename pci_doe_flush_mb() to pci_doe_cancel_tasks() to more aptly describe what it now does. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/1f009f60b326d1c6d776641d4b20aff27de0c234.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	0821ff8ed0	PCI/DOE: Make asynchronous API private A synchronous API for DOE has just been introduced. CXL (the only in-tree DOE user so far) was converted to use it instead of the asynchronous API. Consequently, pci_doe_submit_task() as well as the pci_doe_task struct are only used internally, so make them private. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/cc19544068483681e91dfe27545c2180cd09f931.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Lukas Wunner	62e8b17ffc	PCI/DOE: Provide synchronous API and use it internally The DOE API only allows asynchronous exchanges and forces callers to provide a completion callback. Yet all existing callers only perform synchronous exchanges. Upcoming commits for CMA (Component Measurement and Authentication, PCIe r6.0 sec 6.31) likewise require only synchronous DOE exchanges. Provide a synchronous pci_doe() API call which builds on the internal asynchronous machinery. Convert the internal pci_doe_discovery() to the new call. The new API allows submission of const-declared requests, necessitating the addition of a const qualifier in struct pci_doe_task. Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ming Li <ming4.li@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/0f444206da9615c56301fbaff459c0f45d27f122.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-18 10:36:58 -07:00
Alex Williamson	a5a6dd2624	PCI/PM: Extend D3hot delay for NVIDIA HDA controllers Assignment of NVIDIA Ampere-based GPUs have seen a regression since the below referenced commit, where the reduced D3hot transition delay appears to introduce a small window where a D3hot->D0 transition followed by a bus reset can wedge the device. The entire device is subsequently unavailable, returning -1 on config space read and is unrecoverable without a host reset. This has been observed with RTX A2000 and A5000 GPU and audio functions assigned to a Windows VM, where shutdown of the VM places the devices in D3hot prior to vfio-pci performing a bus reset when userspace releases the devices. The issue has roughly a 2-3% chance of occurring per shutdown. Restoring the HDA controller d3hot_delay to the effective value before the below commit has been shown to resolve the issue. NVIDIA confirms this change should be safe for all of their HDA controllers. Fixes: `3e347969a5` ("PCI/PM: Reduce D3hot delay with usleep_range()") Link: https://lore.kernel.org/r/20230413194042.605768-1-alex.williamson@redhat.com Reported-by: Zhiyi Guo <zhguo@redhat.com> Signed-off-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Tarun Gupta <targupta@nvidia.com> Cc: Abhishek Sahu <abhsahu@nvidia.com> Cc: Tarun Gupta <targupta@nvidia.com>	2023-04-17 15:53:51 -05:00
Michael Kelley	2c6ba42168	PCI: hv: Enable PCI pass-thru devices in Confidential VMs For PCI pass-thru devices in a Confidential VM, Hyper-V requires that PCI config space be accessed via hypercalls. In normal VMs, config space accesses are trapped to the Hyper-V host and emulated. But in a confidential VM, the host can't access guest memory to decode the instruction for emulation, so an explicit hypercall must be used. Add functions to make the new MMIO read and MMIO write hypercalls. Update the PCI config space access functions to use the hypercalls when such use is indicated by Hyper-V flags. Also, set the flag to allow the Hyper-V PCI driver to be loaded and used in a Confidential VM (a.k.a., "Isolation VM"). The driver has previously been hardened against a malicious Hyper-V host[1]. [1] https://lore.kernel.org/all/20220511223207.3386-2-parri.andrea@gmail.com/ Co-developed-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Dexuan Cui <decui@microsoft.com> Signed-off-by: Michael Kelley <mikelley@microsoft.com> Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://lore.kernel.org/r/1679838727-87310-13-git-send-email-mikelley@microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org>	2023-04-17 19:19:04 +00:00
Thomas Gleixner	e3c026be4d	PCI/MSI: Remove over-zealous hardware size check in pci_msix_validate_entries() pci_msix_validate_entries() validates the entries array which is handed in by the caller for a MSI-X interrupt allocation. Aside of consistency failures it also detects a failure when the size of the MSI-X hardware table in the device is smaller than the size of the entries array. That's wrong for the case of range allocations where the caller provides the minimum and the maximum number of vectors to allocate, when the hardware size is greater or equal than the mininum, but smaller than the maximum. Remove the hardware size check completely from that function and just ensure that the entires array up to the maximum size is consistent. The limitation and range checking versus the hardware size happens independently of that afterwards anyway because the entries array is optional. Fixes: `4644d22eb6` ("PCI/MSI: Validate MSI-X contiguous restriction early") Reported-by: David Laight <David.Laight@aculab.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/87v8i3sg62.ffs@tglx	2023-04-16 14:11:51 +02:00
Abel Vesa	6276a403c0	PCI: qcom: Add SM8550 PCIe support SM8550 requires two additional clocks for proper working. Add these two clocks as optional clocks (as only required by this platform) and compatible for this platform. While at it, let's also rename the reset variable to "rst" from "pci_reset" to match the existing naming preference. Link: https://lore.kernel.org/r/20230320144658.1794991-2-abel.vesa@linaro.org Signed-off-by: Abel Vesa <abel.vesa@linaro.org> [lpieralisi@kernel.org: commit log rewording] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org>	2023-04-12 15:22:31 +02:00
Manivannan Sadhasivam	7394d0a85d	PCI: qcom: Add support for SDX55 SoC Add support for SDX55 SoC reusing the 1.9.0 config. The PCIe controller is of version 1.10.0 but it is compatible with the 1.9.0 config. This SoC also requires "sleep" clock which is added as an optional clock in the driver, since it is not required on other SoCs. Link: https://lore.kernel.org/r/20230308082424.140224-14-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-12 10:11:03 +02:00
Manivannan Sadhasivam	c0e1eb441b	PCI: qcom: Enable async probe by default Qcom PCIe RC driver waits for the PHY link to be up during the probe; this consumes several milliseconds during boot. Enable async probe by default so that other drivers can load in parallel while this driver waits for the link to be up. Suggested-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/r/20230320064644.5217-1-manivannan.sadhasivam@linaro.org Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org>	2023-04-12 08:49:04 +02:00
Manivannan Sadhasivam	ad9b9b6e36	PCI: qcom: Add support for system suspend and resume During the system suspend, vote for minimal interconnect bandwidth (1KiB) to keep the interconnect path active for config access and also turn OFF the resources like clock and PHY if there are no active devices connected to the controller. For the controllers with active devices, the resources are kept ON as removing the resources will trigger access violation during the late end of suspend cycle as kernel tries to access the config space of PCIe devices to mask the MSIs. Also, it is not desirable to put the link into L2/L3 state as that implies VDD supply will be removed and the devices may go into powerdown state. This will affect the lifetime of storage devices like NVMe. And finally, during resume, turn ON the resources if the controller was truly suspended (resources OFF) and update the interconnect bandwidth based on PCIe Gen speed. Suggested-by: Krishna chaitanya chundru <quic_krichai@quicinc.com> Link: https://lore.kernel.org/r/20230403154922.20704-2-manivannan.sadhasivam@linaro.org Tested-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Johan Hovold <johan+linaro@kernel.org> Acked-by: Dhruva Gole <d-gole@ti.com>	2023-04-12 08:48:53 +02:00
Mika Westerberg	e74b2b58ff	PCI/PM: Drop pci_bridge_wait_for_secondary_bus() timeout parameter All callers of pci_bridge_wait_for_secondary_bus() supply a timeout of PCIE_RESET_READY_POLL_MS, so drop the parameter. Move the definition of PCIE_RESET_READY_POLL_MS into pci.c, the only user. [bhelgaas: extracted from https://lore.kernel.org/r/20230404052714.51315-3-mika.westerberg@linux.intel.com] Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-04-11 17:35:06 -05:00
Mika Westerberg	e8b908146d	PCI/PM: Increase wait time after resume PCIe r6.0 sec 6.6.1 prescribes that a device must be able to respond to config requests within 1.0 s (PCI_RESET_WAIT) after exiting conventional reset and this same delay is prescribed when coming out of D3cold (as that involves reset too). A device that requires more than 1 second to initialize after reset may respond to config requests with Request Retry Status completions (sec 2.3.1), and we accommodate that in Linux with a 60 second cap (PCIE_RESET_READY_POLL_MS). Previously we waited up to PCIE_RESET_READY_POLL_MS only in the reset code path, not in the resume path. However, a device has surfaced, namely Intel Titan Ridge xHCI, which requires a longer delay also in the resume code path. Make the resume code path to use this same extended delay as the reset path. Link: https://bugzilla.kernel.org/show_bug.cgi?id=216728 Link: https://lore.kernel.org/r/20230404052714.51315-2-mika.westerberg@linux.intel.com Reported-by: Chris Chiu <chris.chiu@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de>	2023-04-11 17:35:02 -05:00
Linus Torvalds	e62252bc55	pci-v6.3-fixes-2 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmQ1qFYUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vxRJA/+JnDrZ8Ag8sIGgI7meuqWcLe3dHXQ 7KaOGM/MP0ir8qJwiobuon2ml1lEU+iWZEcpFH9b9b5iurAfbvZkPhFAWZzJG3GW B6qEX5zaK4tNJHC8byK4pSat2pJLg8Q6xwA6c3H9264tj4Dc83dg2Y8hWWozbU2l xFaawRexCus7Xzd2VEfskwOpHQpSYkWCDtTFclIt1iZeelj6an/0FVdvYqvvIXi+ QZjS80tMeZIcKJq06023KJX36lZHf0RGzYzQTyr7asiNnXBnQ1DfYIASwx1Nf6Y3 9PpvoDVbUGnLDAB8THcOM9j9UOrhlbOJ+4b60J0b4nsT0ORn4rCrN05K0DIIXZi0 CsP3qR2l04xYZFElWrvnA30KS76G7vq0OeAGpQsE0nB1PgydXuiXk5lniQhsrOuj xRiY7YeDFwiNzpjWsVtqkANvl6tA53ZUn+AMEPfIpdYbjnbLcSSF/qY3WCeKoftx KUg6fVOoXbbSctK5rXQnj2yhwQZSkm3DrG9Ol2i3E524vcMKs28QBBuUN+gtL3TP aD6nYP2Q0gItYAQCMtJHCBzfstrO5l9JSLM6xUHR/GwiaLCC9+QRdLJFcAxZvZJi buGs09B2Xz8AbgQ3AejEw7UXk+4JP19ZLyznBhLaaQQO9YOvDYlcn3Xw1QGvJGY/ qsYKElgrLWSz+50= =2gjK -----END PGP SIGNATURE----- Merge tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull pci fixes from Bjorn Helgaas: - Provide pci_msix_can_alloc_dyn() stub when CONFIG_PCI_MSI unset to avoid build errors (Reinette Chatre) - Quirk AMD XHCI controller that loses MSI-X state in D3hot to avoid broken USB after hotplug or suspend/resume (Basavaraj Natikar) - Fix use-after-free in pci_bus_release_domain_nr() (Rob Herring) * tag 'pci-v6.3-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: Fix use-after-free in pci_bus_release_domain_nr() x86/PCI: Add quirk for AMD XHCI controller that loses MSI-X state in D3hot PCI/MSI: Provide missing stub for pci_msix_can_alloc_dyn()	2023-04-11 11:59:49 -07:00
Lukas Wunner	f5eff5591b	PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock In 2013, commits `2e35afaefe` ("PCI: pciehp: Add reset_slot() method") `608c388122` ("PCI: Add slot reset option to pci_dev_reset()") amended PCIe hotplug to mask Presence Detect Changed events during a Secondary Bus Reset. The reset thus no longer causes gratuitous slot bringdown and bringup. However the commits neglected to serialize reset with code paths reading slot registers. For instance, a slot bringup due to an earlier hotplug event may see the Presence Detect State bit cleared during a concurrent Secondary Bus Reset. In 2018, commit `5b3f7b7d06` ("PCI: pciehp: Avoid slot access during reset") retrofitted the missing locking. It introduced a reset_lock which serializes a Secondary Bus Reset with other parts of pciehp. Unfortunately the locking turns out to be overzealous: reset_lock is held for the entire enumeration and de-enumeration of hotplugged devices, including driver binding and unbinding. Driver binding and unbinding acquires device_lock while the reset_lock of the ancestral hotplug port is held. A concurrent Secondary Bus Reset acquires the ancestral reset_lock while already holding the device_lock. The asymmetric locking order in the two code paths can lead to AB-BA deadlocks. Michael Haeuptle reports such deadlocks on simultaneous hot-removal and vfio release (the latter implies a Secondary Bus Reset): pciehp_ist() # down_read(reset_lock) pciehp_handle_presence_or_link_change() pciehp_disable_slot() __pciehp_disable_slot() remove_board() pciehp_unconfigure_device() pci_stop_and_remove_bus_device() pci_stop_bus_device() pci_stop_dev() device_release_driver() device_release_driver_internal() __device_driver_lock() # device_lock() SYS_munmap() vfio_device_fops_release() vfio_device_group_close() vfio_device_close() vfio_device_last_close() vfio_pci_core_close_device() vfio_pci_core_disable() # device_lock() __pci_reset_function_locked() pci_reset_bus_function() pci_dev_reset_slot_function() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Ian May reports the same deadlock on simultaneous hot-removal and an AER-induced Secondary Bus Reset: aer_recover_work_func() pcie_do_recovery() aer_root_reset() pci_bus_error_reset() pci_slot_reset() pci_slot_lock() # device_lock() pci_reset_hotplug_slot() pciehp_reset_slot() # down_write(reset_lock) Fix by releasing the reset_lock during driver binding and unbinding, thereby splitting and shrinking the critical section. Driver binding and unbinding is protected by the device_lock() and thus serialized with a Secondary Bus Reset. There's no need to additionally protect it with the reset_lock. However, pciehp does not bind and unbind devices directly, but rather invokes PCI core functions which also perform certain enumeration and de-enumeration steps. The reset_lock's purpose is to protect slot registers, not enumeration and de-enumeration of hotplugged devices. That would arguably be the job of the PCI core, not the PCIe hotplug driver. After all, an AER-induced Secondary Bus Reset may as well happen during boot-time enumeration of the PCI hierarchy and there's no locking to prevent that either. Exempting de-enumeration from the reset_lock is relatively harmless: A concurrent Secondary Bus Reset may foil config space accesses such as PME interrupt disablement. But if the device is physically gone, those accesses are pointless anyway. If the device is physically present and only logically removed through an Attention Button press or the sysfs "power" attribute, PME interrupts as well as DMA cannot come through because pciehp_unconfigure_device() disables INTx and Bus Master bits. That's still protected by the reset_lock in the present commit. Exempting enumeration from the reset_lock also has limited impact: The exempted call to pci_bus_add_device() may perform device accesses through pcibios_bus_add_device() and pci_fixup_device() which are now no longer protected from a concurrent Secondary Bus Reset. Otherwise there should be no impact. In essence, the present commit seeks to fix the AB-BA deadlocks while still retaining a best-effort reset protection for enumeration and de-enumeration of hotplugged devices -- until a general solution is implemented in the PCI core. Link: https://lore.kernel.org/linux-pci/CS1PR8401MB0728FC6FDAB8A35C22BD90EC95F10@CS1PR8401MB0728.NAMPRD84.PROD.OUTLOOK.COM Link: https://lore.kernel.org/linux-pci/20200615143250.438252-1-ian.may@canonical.com Link: https://lore.kernel.org/linux-pci/ce878dab-c0c4-5bd0-a725-9805a075682d@amd.com Link: https://lore.kernel.org/linux-pci/ed831249-384a-6d35-0831-70af191e9bce@huawei.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: `5b3f7b7d06` ("PCI: pciehp: Avoid slot access during reset") Link: https://lore.kernel.org/r/fef2b2e9edf245c049a8c5b94743c0f74ff5008a.1681191902.git.lukas@wunner.de Reported-by: Michael Haeuptle <michael.haeuptle@hpe.com> Reported-by: Ian May <ian.may@canonical.com> Reported-by: Andrey Grodzovsky <andrey2805@gmail.com> Reported-by: Rahul Kumar <rahul.kumar1@amd.com> Reported-by: Jialin Zhang <zhangjialin11@huawei.com> Tested-by: Anatoli Antonovitch <Anatoli.Antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.19+ Cc: Dan Stein <dstein@hpe.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Alex Michon <amichon@kalrayinc.com> Cc: Xiongfeng Wang <wangxiongfeng2@huawei.com> Cc: Alex Williamson <alex.williamson@redhat.com> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Sathyanarayanan Kuppuswamy <sathyanarayanan.kuppuswamy@linux.intel.com>	2023-04-11 13:18:30 -05:00
Manivannan Sadhasivam	05f4646409	PCI: qcom: Expose link transition counts via debugfs Qualcomm PCIe controllers have debug registers in the MHI region that count PCIe link transitions. Expose them over debugfs to userspace to help debug the low power issues. Note that even though the registers are prefixed as PARF_, they don't live under the "parf" register region. The register naming is following the Qualcomm's internal documentation as like other registers. While at it, let's arrange the local variables in probe function to follow reverse XMAS tree order. Link: https://lore.kernel.org/r/20230316081117.14288-20-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:11 +02:00
Manivannan Sadhasivam	1f70939871	PCI: qcom: Rename qcom_pcie_config_sid_sm8250() to reflect IP version qcom_pcie_config_sid_sm8250() function no longer applies only to SM8250. So let's rename it to reflect the actual IP version and also move its definition to keep it sorted as per IP revisions. Link: https://lore.kernel.org/r/20230316081117.14288-15-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:11 +02:00
Manivannan Sadhasivam	656a08820e	PCI: qcom: Use macros for defining total no. of clocks & supplies To keep uniformity, let's use macros to define the total number of clocks and supplies in qcom_pcie_resources_{2_7_0/2_9_0} structs. Link: https://lore.kernel.org/r/20230316081117.14288-14-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	fb0eacb297	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.4.0 All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. It should be noted that there were delays in-between the reset asserts and deasserts. But going by the config used by other revisions, those delays are not really necessary. So a single delay after all asserts and one after deasserts is used. The total number of resets supported is 12 but only ipq4019 is using all of them. Link: https://lore.kernel.org/r/20230316081117.14288-13-manivannan.sadhasivam@linaro.org Tested-by: Sricharan Ramabadhran <quic_srichara@quicinc.com> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	157fecca35	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.3.3 All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-12-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	b699ed9b03	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.3 All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-11-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	5329bcc4a1	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 2.3.2 All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-10-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	5d4ffe5ec5	PCI: qcom: Use bulk clock APIs for handling clocks for IP rev 1.0.0 All the clocks are enabled and disabled at the same time. So the bulk clock APIs can be used to handle them together. This simplifies the code a lot. Link: https://lore.kernel.org/r/20230316081117.14288-9-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	383215dd2f	PCI: qcom: Use bulk reset APIs for handling resets for IP rev 2.1.0 All the resets are asserted and deasserted at the same time. So the bulk reset APIs can be used to handle them together. This simplifies the code a lot. While at it, let's also move the qcom_pcie_resources_2_1_0 struct below qcom_pcie_resources_1_0_0 to keep it sorted. Link: https://lore.kernel.org/r/20230316081117.14288-8-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	94ebd232db	PCI: qcom: Use lower case for hex To maintain uniformity, let's use lower case for representing hexadecimal numbers. Link: https://lore.kernel.org/r/20230316081117.14288-7-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	17804668ca	PCI: qcom: Add missing macros for register fields Some of the registers are changed using hardcoded bitfields without macros. This provides no information on what the register setting is about. So add the macros to those fields for making the code more understandable. Link: https://lore.kernel.org/r/20230316081117.14288-6-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	57eddec8dc	PCI: qcom: Use bitfield definitions for register fields To maintain uniformity throughout the driver and also to make the code easier to read, let's make use of bitfield definitions for register fields. Link: https://lore.kernel.org/r/20230316081117.14288-5-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	769e49d87b	PCI: qcom: Sort and group registers and bitfield definitions Sorting the registers and their bit definitions will make it easier to add more definitions in the future and it also helps in maintenance. While at it, let's also group the registers and bit definitions separately as done in the pcie-qcom-ep driver. Link: https://lore.kernel.org/r/20230316081117.14288-4-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	39171b33f6	PCI: qcom: Remove PCIE20_ prefix from register definitions The PCIE part is redundant and 20 doesn't represent anything across the SoCs supported now. So let's get rid of the prefix. This involves adding the IP version suffix to one definition of PARF_SLV_ADDR_SPACE_SIZE that defines offset specific to that version. The other definition is generic for the rest of the versions. Also, the register PCIE20_LNK_CONTROL2_LINK_STATUS2 is not used anywhere, hence removed. Link: https://lore.kernel.org/r/20230316081117.14288-3-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-04-11 11:31:10 +02:00
Manivannan Sadhasivam	2542e16c39	PCI: qcom: Fix the incorrect register usage in v2.7.0 config Qcom PCIe IP version v2.7.0 and its derivatives don't contain the PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT register. Instead, they have the new PCIE20_PARF_AXI_MSTR_WR_ADDR_HALT_V2 register. So fix the incorrect register usage which is modifying a different register. Also in this IP version, this register change doesn't depend on MSI being enabled. So remove that check also. Link: https://lore.kernel.org/r/20230316081117.14288-2-manivannan.sadhasivam@linaro.org Fixes: `ed8cc3b1fc` ("PCI: qcom: Add support for SDM845 PCIe controller") Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: <stable@vger.kernel.org> # 5.6+	2023-04-11 11:31:10 +02:00
Linus Torvalds	c08cfd6716	cxl fixes for v6.3-rc6 - Fix several issues with region enumeration in RCH topologies that can trigger crashes on driver startup or shutdown. - Fix CXL DVSEC range register compatibility versus region enumeration that leads to startup crashes - Fix CDAT endiannes handling - Fix multiple buffer handling boundary conditions - Fix Data Object Exchange (DOE) workqueue usage vs CONFIG_DEBUG_OBJECTS warn splats -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZDJmGwAKCRDfioYZHlFs ZwYdAQC34Cjoky6YbE0+R2MkjiJh4ChgaJIdEuqwxr59hEaSLQEAgFjngoR0FXYc AxXPnMNBGAShk0jnm+44zaqfypWAVgw= =Fsrx -----END PGP SIGNATURE----- Merge tag 'cxl-fixes-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link (cxl) fixes from Dan Williams: "Several fixes for driver startup regressions that landed during the merge window as well as some older bugs. The regressions were due to a lack of testing with what the CXL specification calls Restricted CXL Host (RCH) topologies compared to the testing with Virtual Host (VH) CXL topologies. A VH topology is typical PCIe while RCH topologies map CXL endpoints as Root Complex Integrated endpoints. The impact is some driver crashes on startup. This merge window also added compatibility for range registers (the mechanism that CXL 1.1 defined for mapping memory) to treat them like HDM decoders (the mechanism that CXL 2.0 defined for mapping Host-managed Device Memory). That work collided with the new region enumeration code that was tested with CXL 2.0 setups, and fails with crashes at startup. Lastly, the DOE (Data Object Exchange) implementation for retrieving an ACPI-like data table from CXL devices is being reworked for v6.4. Several fixes fell out of that work that are suitable for v6.3. All of this has been in linux-next for a while, and all reported issues [1] have been addressed. Summary: - Fix several issues with region enumeration in RCH topologies that can trigger crashes on driver startup or shutdown. - Fix CXL DVSEC range register compatibility versus region enumeration that leads to startup crashes - Fix CDAT endiannes handling - Fix multiple buffer handling boundary conditions - Fix Data Object Exchange (DOE) workqueue usage vs CONFIG_DEBUG_OBJECTS warn splats" Link: http://lore.kernel.org/r/20230405075704.33de8121@canb.auug.org.au [1] * tag 'cxl-fixes-6.3-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: cxl/hdm: Extend DVSEC range register emulation for region enumeration cxl/hdm: Limit emulation to the number of range registers cxl/region: Move coherence tracking into cxl_region_attach() cxl/region: Fix region setup/teardown for RCDs cxl/port: Fix find_cxl_root() for RCDs and simplify it cxl/hdm: Skip emulation when driver manages mem_enable cxl/hdm: Fix double allocation of @cxlhdm PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y cxl/pci: Handle excessive CDAT length cxl/pci: Handle truncated CDAT entries cxl/pci: Handle truncated CDAT header cxl/pci: Fix CDAT retrieval on big endian	2023-04-09 09:45:46 -07:00
Bjorn Helgaas	774820b362	PCI/EDR: Add edr_handle_event() comments EDR documentation is a bit sketchy. Add a couple comments to edr_handle_event() about the devices involved. Link: https://lore.kernel.org/r/20230407215259.GA3825733@bhelgaas Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>	2023-04-07 17:39:53 -05:00
Kuppuswamy Sathyanarayanan	c441b1e03d	PCI/EDR: Clear Device Status after EDR error recovery During EDR recovery, the OS must clear error status of the port that triggered DPC even if firmware retains control of DPC and AER (see the implementation note in the PCI Firmware spec r3.3, sec 4.6.12). Prior to `068c29a248` ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER"), the port Device Status was cleared in this path: edr_handle_event dpc_process_error(dev) # "dev" triggered DPC pcie_do_recovery(dev, dpc_reset_link) dpc_reset_link # exit DPC pcie_clear_device_status(dev) # clear Device Status After `068c29a248`, pcie_do_recovery() no longer clears Device Status when firmware controls AER, so the error bit remains set even after recovery. Per the "Downstream Port Containment configuration control" bit in the returned _OSC Control Field (sec 4.5.1), the OS is allowed to clear error status until it evaluates _OST, so clear Device Status in edr_handle_event() if the error recovery was successful. [bhelgaas: commit log] Fixes: `068c29a248` ("PCI/ERR: Clear PCIe Device Status errors only if OS owns AER") Link: https://lore.kernel.org/r/20230315235449.1279209-1-sathyanarayanan.kuppuswamy@linux.intel.com Reported-by: Tsaur Erwin <erwin.tsaur@intel.com> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-04-07 16:43:18 -05:00
Rob Herring	30ba2d09ed	PCI: Fix use-after-free in pci_bus_release_domain_nr() Commit `c14f7ccc9f` ("PCI: Assign PCI domain IDs by ida_alloc()") introduced a use-after-free bug in the bus removal cleanup. The issue was found with kfence: [ 19.293351] BUG: KFENCE: use-after-free read in pci_bus_release_domain_nr+0x10/0x70 [ 19.302817] Use-after-free read at 0x000000007f3b80eb (in kfence-#115): [ 19.309677] pci_bus_release_domain_nr+0x10/0x70 [ 19.309691] dw_pcie_host_deinit+0x28/0x78 [ 19.309702] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.309734] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.309752] platform_probe+0x90/0xd8 ... [ 19.311457] kfence-#115: 0x00000000063a155a-0x00000000ba698da8, size=1072, cache=kmalloc-2k [ 19.311469] allocated by task 96 on cpu 10 at 19.279323s: [ 19.311562] __kmem_cache_alloc_node+0x260/0x278 [ 19.311571] kmalloc_trace+0x24/0x30 [ 19.311580] pci_alloc_bus+0x24/0xa0 [ 19.311590] pci_register_host_bridge+0x48/0x4b8 [ 19.311601] pci_scan_root_bus_bridge+0xc0/0xe8 [ 19.311613] pci_host_probe+0x18/0xc0 [ 19.311623] dw_pcie_host_init+0x2c0/0x568 [ 19.311630] tegra_pcie_dw_probe+0x610/0xb28 [pcie_tegra194] [ 19.311647] platform_probe+0x90/0xd8 ... [ 19.311782] freed by task 96 on cpu 10 at 19.285833s: [ 19.311799] release_pcibus_dev+0x30/0x40 [ 19.311808] device_release+0x30/0x90 [ 19.311814] kobject_put+0xa8/0x120 [ 19.311832] device_unregister+0x20/0x30 [ 19.311839] pci_remove_bus+0x78/0x88 [ 19.311850] pci_remove_root_bus+0x5c/0x98 [ 19.311860] dw_pcie_host_deinit+0x28/0x78 [ 19.311866] tegra_pcie_deinit_controller+0x1c/0x38 [pcie_tegra194] [ 19.311883] tegra_pcie_dw_probe+0x648/0xb28 [pcie_tegra194] [ 19.311900] platform_probe+0x90/0xd8 ... [ 19.313579] CPU: 10 PID: 96 Comm: kworker/u24:2 Not tainted 6.2.0 #4 [ 19.320171] Hardware name: /, BIOS 1.0-d7fb19b 08/10/2022 [ 19.325852] Workqueue: events_unbound deferred_probe_work_func The stack trace is a bit misleading as dw_pcie_host_deinit() doesn't directly call pci_bus_release_domain_nr(). The issue turns out to be in pci_remove_root_bus() which first calls pci_remove_bus() which frees the struct pci_bus when its struct device is released. Then pci_bus_release_domain_nr() is called and accesses the freed struct pci_bus. Reordering these fixes the issue. Fixes: `c14f7ccc9f` ("PCI: Assign PCI domain IDs by ida_alloc()") Link: https://lore.kernel.org/r/20230329123835.2724518-1-robh@kernel.org Link: https://lore.kernel.org/r/b529cb69-0602-9eed-fc02-2f068707a006@nvidia.com Reported-by: Jon Hunter <jonathanh@nvidia.com> Tested-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v6.2+ Cc: Pali Rohár <pali@kernel.org>	2023-04-06 18:20:59 -05:00
Cai Huoqing	5376ced54c	PCI/P2PDMA: Fix pci_p2pmem_find_many() kernel-doc Remove reference to pci_p2pmem_dma(), which has never existed. Link: https://lore.kernel.org/r/20230329024731.5604-1-cai.huoqing@linux.dev Signed-off-by: Cai Huoqing <cai.huoqing@linux.dev> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>	2023-04-06 16:37:51 -05:00
Andy Shevchenko	02992064bd	PCI: Make pci_bus_for_each_resource() index optional Refactor pci_bus_for_each_resource() in the same way as pci_dev_for_each_resource(). This allows the index to be hidden inside the implementation so the caller can omit it when it's not used otherwise. No functional changes intended. Link: https://lore.kernel.org/r/20230330162434.35055-6-andriy.shevchenko@linux.intel.com Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>	2023-04-05 15:10:09 -05:00
Mika Westerberg	09cc900632	PCI: Introduce pci_dev_for_each_resource() Instead of open-coding it everywhere introduce a tiny helper that can be used to iterate over each resource of a PCI device, and convert the most obvious users into it. While at it drop doubled empty line before pdev_sort_resources(). No functional changes intended. Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20230330162434.35055-4-andriy.shevchenko@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com>	2023-04-04 10:43:52 -05:00
Lukas Wunner	abf04be0e7	PCI/DOE: Fix memory leak with CONFIG_DEBUG_OBJECTS=y After a pci_doe_task completes, its work_struct needs to be destroyed to avoid a memory leak with CONFIG_DEBUG_OBJECTS=y. Fixes: `9d24322e88` ("PCI/DOE: Add DOE mailbox support functions") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Cc: stable@vger.kernel.org # v6.0+ Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/775768b4912531c3b887d405fc51a50e465e1bf9.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:17:21 -07:00
Lukas Wunner	92dc899c3b	PCI/DOE: Silence WARN splat with CONFIG_DEBUG_OBJECTS=y Gregory Price reports a WARN splat with CONFIG_DEBUG_OBJECTS=y upon CXL probing because pci_doe_submit_task() invokes INIT_WORK() instead of INIT_WORK_ONSTACK() for a work_struct that was allocated on the stack. All callers of pci_doe_submit_task() allocate the work_struct on the stack, so replace INIT_WORK() with INIT_WORK_ONSTACK() as a backportable short-term fix. The long-term fix implemented by a subsequent commit is to move to a synchronous API which allocates the work_struct internally in the DOE library. Stacktrace for posterity: WARNING: CPU: 0 PID: 23 at lib/debugobjects.c:545 __debug_object_init.cold+0x18/0x183 CPU: 0 PID: 23 Comm: kworker/u2:1 Not tainted 6.1.0-0.rc1.20221019gitaae703b02f92.17.fc38.x86_64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014 Call Trace: pci_doe_submit_task+0x5d/0xd0 pci_doe_discovery+0xb4/0x100 pcim_doe_create_mb+0x219/0x290 cxl_pci_probe+0x192/0x430 local_pci_probe+0x41/0x80 pci_device_probe+0xb3/0x220 really_probe+0xde/0x380 __driver_probe_device+0x78/0x170 driver_probe_device+0x1f/0x90 __driver_attach_async_helper+0x5c/0xe0 async_run_entry_fn+0x30/0x130 process_one_work+0x294/0x5b0 Fixes: `9d24322e88` ("PCI/DOE: Add DOE mailbox support functions") Link: https://lore.kernel.org/linux-cxl/Y1bOniJliOFszvIK@memverge.com/ Reported-by: Gregory Price <gregory.price@memverge.com> Tested-by: Ira Weiny <ira.weiny@intel.com> Tested-by: Gregory Price <gregory.price@memverge.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Gregory Price <gregory.price@memverge.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/67a9117f463ecdb38a2dbca6a20391ce2f1e7a06.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-03 16:17:03 -07:00
Greg Kroah-Hartman	cd8fe5b6db	Merge 6.3-rc5 into driver-core-next We need the fixes in here for testing, as well as the driver core changes for documentation updates to build on. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-04-03 09:33:30 +02:00
Linus Torvalds	916fc60988	pci-v6.3-fixes-1 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmQnBfIUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vxhcRAApsCfZxdl4AgJptBZv1SJYAKjNAua 2klMsmeJVH4JujzyiRhZ0XUXZ5+SBoKG1fJjd8WCLZmSd9f4+eaHWpuBW9bXVo6l z92E01yg/couGIgXSlcVYTYSApBqT2WhBKWdrQ+ObzwEgseJCnOhmcmbocfHGrmC pe1PhnEeC05RO+4oQ3B2lyiZr4uQb9LNYMJf2PtaJzqmTzFV6WDaIlRt/kElOnwO WKYiZ7SxC2LeuE/asUlVdSqQ73JZ+ZkAzwvJxjB9CCBa1zCIdq+lJuGl/D7Hkr2W QuMFqE7TahACc3fM8QOMQa8cQh0i2pXYRSP2lOizCy8AMgsKtNwO/iOsqMT8AyVS bhns6tBQozk80FakUxOxxNFk+rxBMu87Vb0DlvoGnpNkHuliv2oixFEo7CymMCbX wARaJgTEbVfP0bnHGYr0pdNomYYJgqgn5OQKRqLyDKkliMiHawQ7dY4PU2Wa89Cw IllvA6kuaoq73grbhyIysqOWbEeH3MrMbIe2RRdzBQERhK4rcbyoureZ/DcftvWF nriaomaeLQykXqh5ySjRXwjWjZTe1AKafuxSpm/zeXuUfM7LUo/+1D8PazAYOojV Sdx+x6YImI/qYfvmH1bdY/r2Rl1pRUgMM18RSqP/tyV3A8jjHcC4FYESwP0OQoGy Aw2LMCRTAcPkvVM= =GigZ -----END PGP SIGNATURE----- Merge tag 'pci-v6.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fix from Bjorn Helgaas: - Fix DesignWare PORT_LINK_CONTROL setup, which was corrupted when the DT "snps,enable-cdm-check" property was present (Yoshihiro Shimoda) * tag 'pci-v6.3-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabled	2023-03-31 13:07:01 -07:00
Pali Rohár	6c6fa1f3f7	PCI: ixp4xx: Use PCI_CONF1_ADDRESS() macro Simplify pci-ixp4xx.c driver code and use new PCI_CONF1_ADDRESS() macro for accessing PCI config space. Link: https://lore.kernel.org/r/20220928122539.15116-1-pali@kernel.org Tested-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-03-24 16:48:32 +01:00
Sergio Paracuellos	50233e105a	PCI: mt7621: Use dev_info() to log PCIe card detection When there is no card plugged on a PCIe port a log reporting that the port will be disabled is flagged as an error (dev_err()). Since this is not an error at all, change the log level by using dev_info() instead. Link: https://lore.kernel.org/r/20230324073733.1596231-1-sergio.paracuellos@gmail.com Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-03-24 16:36:22 +01:00
H. Nikolaus Schaller	5f5ac460df	PCI: imx6: Install the fault handler only on compatible match commit `bb38919ec5` ("PCI: imx6: Add support for i.MX6 PCIe controller") added a fault hook to this driver in the probe function. So it was only installed if needed. commit `bde4a5a00e` ("PCI: imx6: Allow probe deferral by reset GPIO") moved it from probe to driver init which installs the hook unconditionally as soon as the driver is compiled into a kernel. When this driver is compiled as a module, the hook is not registered until after the driver has been matched with a .compatible and loaded. commit `415b6185c5` ("PCI: imx6: Fix config read timeout handling") extended the fault handling code. commit `2d8ed461db` ("PCI: imx6: Add support for i.MX8MQ") added some protection for non-ARM architectures, but this does not protect non-i.MX ARM architectures. Since fault handlers can be triggered on any architecture for different reasons, there is no guarantee that they will be triggered only for the assumed situation, leading to improper error handling (i.MX6-specific imx6q_pcie_abort_handler) on foreign systems. I had seen strange L3 imprecise external abort messages several times on OMAP4 and OMAP5 devices and couldn't make sense of them until I realized they were related to this unused imx6q driver because I had CONFIG_PCI_IMX6=y. Note that CONFIG_PCI_IMX6=y is useful for kernel binaries that are designed to run on different ARM SoC and be differentiated only by device tree binaries. So turning off CONFIG_PCI_IMX6 is not a solution. Therefore we check the compatible in the init function before registering the fault handler. Link: https://lore.kernel.org/r/e1bcfc3078c82b53aa9b78077a89955abe4ea009.1678380991.git.hns@goldelico.com Fixes: `bde4a5a00e` ("PCI: imx6: Allow probe deferral by reset GPIO") Fixes: `415b6185c5` ("PCI: imx6: Fix config read timeout handling") Fixes: `2d8ed461db` ("PCI: imx6: Add support for i.MX8MQ") Signed-off-by: H. Nikolaus Schaller <hns@goldelico.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Richard Zhu <hongxing.zhu@nxp.com>	2023-03-24 16:07:55 +01:00
Greg Kroah-Hartman	75cff725d9	driver core: bus: mark the struct bus_type for sysfs callbacks as constant struct bus_type should never be modified in a sysfs callback as there is nothing in the structure to modify, and frankly, the structure is almost never used in a sysfs callback, so mark it as constant to allow struct bus_type to be moved to read-only memory. Cc: "David S. Miller" <davem@davemloft.net> Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: "K. Y. Srinivasan" <kys@microsoft.com> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Bounine <alex.bou9@gmail.com> Cc: Alison Schofield <alison.schofield@intel.com> Cc: Ben Widawsky <bwidawsk@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Eric Dumazet <edumazet@google.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Hannes Reinecke <hare@suse.de> Cc: Harald Freudenberger <freude@linux.ibm.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Hu Haowen <src.res@email.cn> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Laurentiu Tudor <laurentiu.tudor@nxp.com> Cc: Matt Porter <mporter@kernel.crashing.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Stuart Yoder <stuyoder@gmail.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vishal Verma <vishal.l.verma@intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Acked-by: Ilya Dryomov <idryomov@gmail.com> # rbd Acked-by: Ira Weiny <ira.weiny@intel.com> # cxl Reviewed-by: Alex Shi <alexs@kernel.org> Acked-by: Iwona Winiarska <iwona.winiarska@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci Acked-by: Wei Liu <wei.liu@kernel.org> Acked-by: Martin K. Petersen <martin.petersen@oracle.com> # scsi Link: https://lore.kernel.org/r/20230313182918.1312597-23-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-23 13:20:40 +01:00
Lukas Wunner	fbaa38214c	cxl/pci: Fix CDAT retrieval on big endian The CDAT exposed in sysfs differs between little endian and big endian arches: On big endian, every 4 bytes are byte-swapped. PCI Configuration Space is little endian (PCI r3.0 sec 6.1). Accessors such as pci_read_config_dword() implicitly swap bytes on big endian. That way, the macros in include/uapi/linux/pci_regs.h work regardless of the arch's endianness. For an example of implicit byte-swapping, see ppc4xx_pciex_read_config(), which calls in_le32(), which uses lwbrx (Load Word Byte-Reverse Indexed). DOE Read/Write Data Mailbox Registers are unlike other registers in Configuration Space in that they contain or receive a 4 byte portion of an opaque byte stream (a "Data Object" per PCIe r6.0 sec 7.9.24.5f). They need to be copied to or from the request/response buffer verbatim. So amend pci_doe_send_req() and pci_doe_recv_resp() to undo the implicit byte-swapping. The CXL_DOE_TABLE_ACCESS_* and PCI_DOE_DATA_OBJECT_DISC_* macros assume implicit byte-swapping. Byte-swap requests after constructing them with those macros and byte-swap responses before parsing them. Change the request and response type to __le32 to avoid sparse warnings. Per a request from Jonathan, replace sizeof(u32) with sizeof(__le32) for consistency. Fixes: `c97006046c` ("cxl/port: Read CDAT table") Tested-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Cc: stable@vger.kernel.org # v6.0+ Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/3051114102f41d19df3debbee123129118fc5e6d.1678543498.git.lukas@wunner.de Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-03-21 12:27:08 -07:00
Yoshihiro Shimoda	cdce670991	PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabled If CDM_CHECK is enabled (by the DT "snps,enable-cdm-check" property), 'val' is overwritten by PCIE_PL_CHK_REG_CONTROL_STATUS initialization. Commit `ec7b952f45` ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") did not account for further usage of 'val', so we wrote improper values to PCIE_PORT_LINK_CONTROL when the CDM check is enabled. Move the PCIE_PORT_LINK_CONTROL update to be completely after the PCIE_PL_CHK_REG_CONTROL_STATUS register initialization. [bhelgaas: commit log adapted from Serge's version] Fixes: `ec7b952f45` ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") Link: https://lore.kernel.org/r/20230310123510.675685-2-yoshihiro.shimoda.uh@renesas.com Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com>	2023-03-21 13:06:24 -05:00
Xiaowei Bao	be567c6cbc	PCI: layerscape: Add EP mode support for ls1028a Add PCIe EP mode support for ls1028a. Link: https://lore.kernel.org/r/20230209151050.233973-1-Frank.Li@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Alok Tiwari <alok.a.tiwari@oracle.com> Acked-by: Roy Zang <Roy.Zang@nxp.com>	2023-03-17 17:36:37 +01:00
Greg Kroah-Hartman	1aaba11da9	driver core: class: remove module * from class_create() The module pointer in class_create() never actually did anything, and it shouldn't have been requred to be set as a parameter even if it did something. So just remove it and fix up all callers of the function in the kernel tree at the same time. Cc: "Rafael J. Wysocki" <rafael@kernel.org> Acked-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20230313181843.1207845-4-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-03-17 15:16:33 +01:00
Niklas Schnelle	ab90950985	PCI: s390: Fix use-after-free of PCI resources with per-function hotplug On s390 PCI functions may be hotplugged individually even when they belong to a multi-function device. In particular on an SR-IOV device VFs may be removed and later re-added. In commit `a50297cf82` ("s390/pci: separate zbus creation from scanning") it was missed however that struct pci_bus and struct zpci_bus's resource list retained a reference to the PCI functions MMIO resources even though those resources are released and freed on hot-unplug. These stale resources may subsequently be claimed when the PCI function re-appears resulting in use-after-free. One idea of fixing this use-after-free in s390 specific code that was investigated was to simply keep resources around from the moment a PCI function first appeared until the whole virtual PCI bus created for a multi-function device disappears. The problem with this however is that due to the requirement of artificial MMIO addreesses (address cookies) extra logic is then needed to keep the address cookies compatible on re-plug. At the same time the MMIO resources semantically belong to the PCI function so tying their lifecycle to the function seems more logical. Instead a simpler approach is to remove the resources of an individually hot-unplugged PCI function from the PCI bus's resource list while keeping the resources of other PCI functions on the PCI bus untouched. This is done by introducing pci_bus_remove_resource() to remove an individual resource. Similarly the resource also needs to be removed from the struct zpci_bus's resource list. It turns out however, that there is really no need to add the MMIO resources to the struct zpci_bus's resource list at all and instead we can simply use the zpci_bar_struct's resource pointer directly. Fixes: `a50297cf82` ("s390/pci: separate zbus creation from scanning") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Matthew Rosato <mjrosato@linux.ibm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/20230306151014.60913-2-schnelle@linux.ibm.com Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>	2023-03-13 09:15:11 +01:00
Lukas Bulwahn	727de4c087	PCI: rcar: Avoid defines prefixed with CONFIG Defines prefixed with "CONFIG" should be limited to proper Kconfig options, that are introduced in a Kconfig file. In the R-car driver the bitmask to configure the SEND_ENABLE mode is named CONFIG_SEND_ENABLE. Rename this local definition to a more suitable name, containing the register bitfield name defined in the R-Car Gen3 rev. 2.30 user manual. No functional change. Link: https://lore.kernel.org/r/20230113084516.31888-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> [lpieralisi@kernel.org: Changed define naming and commit log] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Geert Uytterhoeven <geert+renesas@glider.be>	2023-03-10 13:34:27 +01:00
Josh Triplett	3a2776e8a0	PCI: kirin: Select REGMAP_MMIO pcie-kirin uses regmaps, and needs to pull them in; otherwise, with CONFIG_PCIE_KIRIN=y and without CONFIG_REGMAP_MMIO pcie-kirin produces a linker failure looking for __devm_regmap_init_mmio_clk(). Fixes: `d19afe7be1` ("PCI: kirin: Use regmap for APB registers") Link: https://lore.kernel.org/r/04636141da1d6d592174eefb56760511468d035d.1668410580.git.josh@joshtriplett.org Signed-off-by: Josh Triplett <josh@joshtriplett.org> [lpieralisi@kernel.org: commit log and removed REGMAP select] Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: stable@vger.kernel.org # 5.16+	2023-03-10 11:22:24 +01:00
Linus Torvalds	4e9c542c7a	A set of updates for the interrupt susbsystem: - Prevent possible NULL pointer derefences in irq_data_get_affinity_mask() and irq_domain_create_hierarchy(). - Take the per device MSI lock before invoking code which relies on it being hold. - Make sure that MSI descriptors are unreferenced before freeing them. This was overlooked when the platform MSI code was converted to use core infrastructure and results in a fals positive warning. - Remove dead code in the MSI subsystem. - Clarify the documentation for pci_msix_free_irq(). - More kobj_type constification. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmQEVToTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYodc9EACg5HBOGsh5OV8pwnuEDqfThK/dOZ5k LJ8xQjGAx29JeNWu4gkkHaSDEGjZhwLiZlB6qUeH+LPQ1NgmAlLL3T2NxEOOWa6y z/xQv+1Ceu6XxazpCSFRWR/6w4Nyup92jhlsUIkmmsWkVvKH/pV6Uo+3ta0WagWg heb3vqts6J0AOJaMepF8azYGbwAPSIElNLI1UtiEuQYEKU55N8jLK20VJTL6lzJ2 FyRg/0ghNWDAaBdnv4cZCQ/MzoG5UkoU3f2cqhdSce5mqnq2fKRfgBjzllNgaRgA zxOxIR88QaKTMHIr+WKD1dyWxDQlotFbBOkmVW39XAa13rn42s4GIeW88VCjJGww RAm52SbC48cCIyNQlh4A6Vhb4vjPx2DndWbWnnVj5fWlUevdAPRxSlm4BjfxFxe+ LbuZCRRL1jjlC0fXmhVXTTxeE1/K7jarAZwRV7Nxhr3g0gT+Zv1jyaaW9rWuHq5U 3pS+xBl89LA/VYp9tv6jDfJlocmRwgrFbGX4UlfikqtObdTFqcH0FtmqisE61fZS n0194BMWNDfPSibSpDohf/CDPoHZ6pNxeuqkVDiisUJHPpIYOt8+lH+8//DgBL7a oi9zS0JazPIn2VM6NB4f/WXOYmS9GZq5+loiYEWb52AYtodKUKmOoWG0SUyy6XFr E7yJzemsUwrJVg== =jWot -----END PGP SIGNATURE----- Merge tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "A set of updates for the interrupt susbsystem: - Prevent possible NULL pointer derefences in irq_data_get_affinity_mask() and irq_domain_create_hierarchy() - Take the per device MSI lock before invoking code which relies on it being hold - Make sure that MSI descriptors are unreferenced before freeing them. This was overlooked when the platform MSI code was converted to use core infrastructure and results in a fals positive warning - Remove dead code in the MSI subsystem - Clarify the documentation for pci_msix_free_irq() - More kobj_type constification" * tag 'irq-urgent-2023-03-05' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/msi, platform-msi: Ensure that MSI descriptors are unreferenced genirq/msi: Drop dead domain name assignment irqdomain: Add missing NULL pointer check in irq_domain_create_hierarchy() genirq/irqdesc: Make kobj_type structures constant PCI/MSI: Clarify usage of pci_msix_free_irq() genirq/msi: Take the per-device MSI lock before validating the control structure genirq/ipi: Fix NULL pointer deref in irq_data_get_affinity_mask()	2023-03-05 11:19:16 -08:00
Linus Torvalds	84cc6674b7	virtio,vhost,vdpa: features, fixes device feature provisioning in ifcvf, mlx5 new SolidNET driver support for zoned block device in virtio blk numa support in virtio pmem VIRTIO_F_RING_RESET support in vhost-net more debugfs entries in mlx5 resume support in vdpa completion batching in virtio blk cleanup of dma api use in vdpa now simulating more features in vdpa-sim documentation, features, fixes all over the place Signed-off-by: Michael S. Tsirkin <mst@redhat.com> -----BEGIN PGP SIGNATURE----- iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmP0D98PHG1zdEByZWRo YXQuY29tAAoJECgfDbjSjVRpV6IH/iecRgLMWWjp3n31IFdu31f/J4HpF7dczVjK qtV98eJ1N2pkgeJkdCfmB5XszfvFBeAurrS7++FTHiJhrRfR3Z+2ml/Qtvh5DEyP qxz6wOw6VVsi/txdUxM1wsxLeEmmzkmFdAmPM+FXeIjhWj76GOgy/4A3eaj6TgzV W8ShsBve/UZ5qMOC3XbIscvdOrudHJ18tH90Tiz3NZfH1fAs5E4uWbU6Mrz9DJVr canGvf4kAI9z8qram5HSgzPIXRJEYiF4q/eiStdtiiME8gL1mHLRZDNP1I1LeCAb q6Q6RCRKi3Ek+LGdH6u+nR1Swu03N2b/g+vgKtv30kJo06oZVzw= =EasV -----END PGP SIGNATURE----- Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio updates from Michael Tsirkin: - device feature provisioning in ifcvf, mlx5 - new SolidNET driver - support for zoned block device in virtio blk - numa support in virtio pmem - VIRTIO_F_RING_RESET support in vhost-net - more debugfs entries in mlx5 - resume support in vdpa - completion batching in virtio blk - cleanup of dma api use in vdpa - now simulating more features in vdpa-sim - documentation, features, fixes all over the place * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (64 commits) vdpa/mlx5: support device features provisioning vdpa/mlx5: make MTU/STATUS presence conditional on feature bits vdpa: validate device feature provisioning against supported class vdpa: validate provisioned device features against specified attribute vdpa: conditionally read STATUS in config space vdpa: fix improper error message when adding vdpa dev vdpa/mlx5: Initialize CVQ iotlb spinlock vdpa/mlx5: Don't clear mr struct on destroy MR vdpa/mlx5: Directly assign memory key tools/virtio: enable to build with retpoline vringh: fix a typo in comments for vringh_kiov vhost-vdpa: print warning when vhost_vdpa_alloc_domain fails scsi: virtio_scsi: fix handling of kmalloc failure vdpa: Fix a couple of spelling mistakes in some messages vhost-net: support VIRTIO_F_RING_RESET vhost-scsi: convert sysfs snprintf and sprintf to sysfs_emit vdpa: mlx5: support per virtqueue dma device vdpa: set dma mask for vDPA device virtio-vdpa: support per vq dma device vdpa: introduce get_vq_dma_device() ...	2023-02-25 11:48:02 -08:00
Linus Torvalds	7c3dc440b1	cxl for v6.3 - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements. -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY/WYcgAKCRDfioYZHlFs Z6m3APkBUtiEEm1o8ikdu5llUS1OTLBwqjJDwGMTyf8X/WDXhgD+J2mLsCgARS7X 5IS0RAtefutrW5sQpUucPM7QiLuraAY= =kOXC -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dan Williams: "To date Linux has been dependent on platform-firmware to map CXL RAM regions and handle events / errors from devices. With this update we can now parse / update the CXL memory layout, and report events / errors from devices. This is a precursor for the CXL subsystem to handle the end-to-end "RAS" flow for CXL memory. i.e. the flow that for DDR-attached-DRAM is handled by the EDAC driver where it maps system physical address events to a field-replaceable-unit (FRU / endpoint device). In general, CXL has the potential to standardize what has historically been a pile of memory-controller-specific error handling logic. Another change of note is the default policy for handling RAM-backed device-dax instances. Previously the default access mode was "device", mmap(2) a device special file to access memory. The new default is "kmem" where the address range is assigned to the core-mm via add_memory_driver_managed(). This saves typical users from wondering why their platform memory is not visible via free(1) and stuck behind a device-file. At the same time it allows expert users to deploy policy to, for example, get dedicated access to high performance memory, or hide low performance memory from general purpose kernel allocations. This affects not only CXL, but also systems with high-bandwidth-memory that platform-firmware tags with the EFI_MEMORY_SP (special purpose) designation. Summary: - CXL RAM region enumeration: instantiate 'struct cxl_region' objects for platform firmware created memory regions - CXL RAM region provisioning: complement the existing PMEM region creation support with RAM region support - "Soft Reservation" policy change: Online (memory hot-add) soft-reserved memory (EFI_MEMORY_SP) by default, but still allow for setting aside such memory for dedicated access via device-dax. - CXL Events and Interrupts: Takeover CXL event handling from platform-firmware (ACPI calls this CXL Memory Error Reporting) and export CXL Events via Linux Trace Events. - Convey CXL _OSC results to drivers: Similar to PCI, let the CXL subsystem interrogate the result of CXL _OSC negotiation. - Emulate CXL DVSEC Range Registers as "decoders": Allow for first-generation devices that pre-date the definition of the CXL HDM Decoder Capability to translate the CXL DVSEC Range Registers into 'struct cxl_decoder' objects. - Set timestamp: Per spec, set the device timestamp in case of hotplug, or if platform-firwmare failed to set it. - General fixups: linux-next build issues, non-urgent fixes for pre-production hardware, unit test fixes, spelling and debug message improvements" * tag 'cxl-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (66 commits) dax/kmem: Fix leak of memory-hotplug resources cxl/mem: Add kdoc param for event log driver state cxl/trace: Add serial number to trace points cxl/trace: Add host output to trace points cxl/trace: Standardize device information output cxl/pci: Remove locked check for dvsec_range_allowed() cxl/hdm: Add emulation when HDM decoders are not committed cxl/hdm: Create emulated cxl_hdm for devices that do not have HDM decoders cxl/hdm: Emulate HDM decoder from DVSEC range registers cxl/pci: Refactor cxl_hdm_decode_init() cxl/port: Export cxl_dvsec_rr_decode() to cxl_port cxl/pci: Break out range register decoding from cxl_hdm_decode_init() cxl: add RAS status unmasking for CXL cxl: remove unnecessary calling of pci_enable_pcie_error_reporting() dax/hmem: build hmem device support as module if possible dax: cxl: add CXL_REGION dependency cxl: avoid returning uninitialized error code cxl/pmem: Fix nvdimm registration races cxl/mem: Fix UAPI command comment cxl/uapi: Tag commands from cxl_query_cmd() ...	2023-02-25 09:19:23 -08:00
Linus Torvalds	8ff99ad04c	phy-for-6.3 - Core support - New devm_of_phy_optional_get() API with users and conversion - New support: - Mediatek MT7986 tphy support - Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY support amd SM8350 & SM8450 combo phy support - Qualcomm SNPS eUSB2 eUSB2 repeater drivers - Allwinner F1C100s USB PHY support - Tegra xusb support for Tegra234 - Updates: - Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy - G4 mode support in Qualcomm UFS phy and support for various SoCs - Yaml conversion for Meson usb2 phy - TI Type C support for usb phy for j721 - Yaml conversion for Tegra xusb binding -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmP4vg0ACgkQfBQHDyUj g0dJYxAA0DL9X6IFeK8U5blp/ZWmXBqbhRmnsf0KGsSRJMSgHkPruhOWfTCAvXnc 2+xp8wiE0MdLI7xseJWZi7Q2r10f5LtU55rzbL3mI7MWd/g2WTlKiXCDrpa4fY/Z pxo+892vUJh3+I2+Sjf0JnIY89MV/sqSLXsFeKDtvp7J9lMjA98TV6m+YDVTXn22 SW3hjaB8ochSQV1HEMdEJWsrZc3lmszLdQM+qz3PafyQRbhc1A98Vkf0X/sWR/Ot p0FCXlNnY3O272dnrU0V5yv7wwWqjVDN5+Q3vk3AbSlo9ERLVwchayUzxi8EIS7t cPmxhsyMoEmsSIPx4z47vLt1NQoqiaKNM7XCrn13Z0fE9fbTW8Trx8VBXcIUsE98 hT6IxrjRFGJOta8koOssBqSjuwP4QBIZiwXL2YEujj3hGqyRefOCN5XBek7dVyDe ctwJsIKBCG8Wh87dFldYLrJgQKR9svZXDjxVADpYMUpPM2v02DCWhUyM50ODowZf Yl7bP8dXtn2UBIybbhNTZg29PbrATk73tcr73GZeX8JTOK2vpsZ3+fUsdxPYzed3 lF2vw361E2ry1DtgmH7XMXevDFvKJ/aks5FIAKebc1tlAPPGYVIkBqyQprAQmlS3 tDQ+6+jQLAr14iSaVQd9MC3obNqbJYHf1WEU3rKtDy3MB0flbqo= =2g27 -----END PGP SIGNATURE----- Merge tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "This features a bunch of new device support, a couple of new drivers, yaml conversion and updates of a few drivers. Core support: - New devm_of_phy_optional_get() API with users and conversion New hardware support: - Mediatek MT7986 phy support - Qualcomm SM8550 UFS, PCIe, combo phy support, SM6115 / SM4250 USB3 phy support, SM6350 combo phy support, SM6125 UFS PHY support amd SM8350 & SM8450 combo phy support - Qualcomm SNPS eUSB2 eUSB2 repeater drivers - Allwinner F1C100s USB PHY support - Tegra xusb support for Tegra234 Updates: - Yaml conversion for Qualcomm pcie2 phy and usb-hsic-phy - G4 mode support in Qualcomm UFS phy and support for various SoCs - Yaml conversion for Meson usb2 phy - TI Type C support for usb phy for j721 - Yaml conversion for Tegra xusb binding" * tag 'phy-for-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (106 commits) phy: qcom: phy-qcom-snps-eusb2: Add support for eUSB2 repeater phy: qcom: Add QCOM SNPS eUSB2 repeater driver dt-bindings: phy: qcom,snps-eusb2-phy: Add phys property for the repeater dt-bindings: phy: Add qcom,snps-eusb2-repeater schema file dt-bindings: phy: amlogic,g12a-usb3-pcie-phy: add missing optional phy-supply property phy: rockchip-typec: Fix unsigned comparison with less than zero phy: rockchip-typec: fix tcphy_get_mode error case phy: qcom: snps-eusb2: Add missing headers phy: qcom-qmp-combo: Add support for SM8550 phy: qcom-qmp: Add v6 DP register offsets phy: qcom-qmp: pcs-usb: Add v6 register offsets dt-bindings: phy: qcom,sc8280xp-qmp-usb43dp: Document SM8550 compatible phy: qcom: Add QCOM SNPS eUSB2 driver dt-bindings: phy: Add qcom,snps-eusb2-phy schema file phy: qcom-qmp-pcie: Add support for SM8550 g3x2 and g4x2 PCIEs phy: qcom-qmp: qserdes-lane-shared: Add v6 register offsets phy: qcom-qmp: qserdes-txrx: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6.20 register offsets phy: qcom-qmp: pcs-pcie: Add v6 register offsets phy: qcom-qmp: pcs: Add v6.20 register offsets ...	2023-02-24 17:22:11 -08:00
Linus Torvalds	90ddb3f034	pci-v6.3-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmP2dbsUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vzBDg//aW2IeJYku5ENXwwnCQjBlyjBGOOZ 456KGpFt/ky0N9Jp0ZS3nQSa5YN7q+L8XY48gu6I7s1hXly8iLZKLrJN++S//k55 BadXu7mDUyVoY74LYvBe0nlXuwJul2qnq9IJLufRucrn1yoyqApAh39IRdCzi4U8 mP+wad7sQA0Si4bpf80uwn6Yq8SrDoO0mtmO/dZSXJooM2t2SnDXEL/fxMwTNDA4 XsVSP9FrbPmcTLo8mkDa8Dy7JKbL6KQJF9yDlmYzuA2spQpTf+YLLfsNnmE+850h WTtfCjVaYtlik7i9qTB+VcN1CsGVepYKK3H5the16Aeql2Fu+Ji5KSt74C220Yi9 ZSDA93d/EfGc5egKyBdUUMFgqhe46srRUAoWcMrx2T4ARGuOm5EYCa9C8C7dFmO0 j6f9MYL3j2Sw3FROEKViRVOFfbIfVW1TXIo3x0fE0ud3xkg73eKp/++X8QeTMjox 2ArY2AWPNQpUI1oMlKxlSEd5XjFf7n/hHDtFqj9bIuJzt0/8wXQf0jCYTjhpGkRB pmO+lColK6lp+bg8aWRRkiwN73xGdQhKaeXLo0Iq4T6xr0Lb3XoskHZvt6NIGe/A ds5/uwtErq6kCf2G9YG1xfh+G1bimbjWwsHCNfSNXzTsWGDFTCb8tvqF90m+7+yl bllxTXA6PO312Tw= =/y4d -----END PGP SIGNATURE----- Merge tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Rework portdrv shutdown so it disables interrupts but doesn't disable bus mastering, which leads to hangs on Loongson LS7A - Add mechanism to prevent Max_Read_Request_Size (MRRS) increases, again to avoid hardware issues on Loongson LS7A (and likely other devices based on DesignWare IP) - Ignore devices with a firmware (DT or ACPI) node that says the device is disabled Resource management: - Distribute spare resources to unconfigured hotplug bridges at boot-time (not just when hot-adding such a bridge), which makes hot-adding devices to docks work better. Tried this in v6.1 but had to revert for regressions, so try again - Fix root bus issue that dropped resources that happened to end at 0, e.g., [bus 00] PCI device hotplug: - Remove device locking when marking device as disconnected so this doesn't have to wait for concurrent driver bind/unbind to complete - Quirk more Qualcomm bridges that don't fully implement the PCIe Slot Status 'Command Completed' bit Power management: - Account for _S0W of the target bridge in acpi_pci_bridge_d3() so we don't miss hot-add notifications for USB4 docks, Thunderbolt, etc Reset: - Observe delay after reset, e.g., resuming from system sleep, regardless of whether a bridge can suspend to D3cold at runtime - Wait for secondary bus to become ready after a bridge reset Virtualization: - Avoid FLR on some AMD FCH AHCI adapters where it doesn't work - Allow independent IOMMU groups for some Wangxun NICs that prevent peer-to-peer transactions but don't advertise an ACS Capability Error handling: - Configure End-to-End-CRC (ECRC) only if Linux owns the AER Capability - Remove redundant Device Control Error Reporting Enable in the AER service driver since this is already done for all devices during enumeration ASPM: - Add pci_enable_link_state() interface to allow drivers to enable ASPM link state Endpoint framework: - Move dra7xx and tegra194 linkup processing from hard IRQ to threaded IRQ handler - Add a separate lock for endpoint controller list of endpoint function drivers to prevent deadlock in callbacks - Pass events from endpoint controller to endpoint function drivers via callbacks instead of notifiers Synopsys DesignWare eDMA controller driver (acked by Vinod): - Fix CPU vs PCI address issues - Fix source vs destination address issues - Fix issues with interleaved transfer semantics - Fix channel count initialization issue (issue still exists in several other drivers) - Clean up and improve debugfs usage so it will work on platforms with several eDMA devices Baikal T-1 PCIe controller driver: - Set a 64-bit DMA mask Freescale i.MX6 PCIe controller driver: - Add i.MX8MM, i.MX8MQ, i.MX8MP endpoint mode DT binding and driver support Intel VMD host bridge driver: - Add quirk to configure PCIe ASPM and LTR. This is normally done by BIOS, and will be for future products Marvell MVEBU PCIe controller driver: - Mark this driver as broken in Kconfig since bugs prevent its daily usage MediaTek MT7621 PCIe controller driver: - Delay PHY port initialization to improve boot reliability for ZBT WE1326, ZBT WF3526-P, and some Netgear models Qualcomm PCIe controller driver: - Add MSM8998 DT compatible string - Unify MSM8996 and MSM8998 clock orderings - Add SM8350 DT binding and driver support - Add IPQ8074 Gen3 DT binding and driver support - Correct qcom,perst-regs in DT binding - Add qcom_pcie_host_deinit() so the PHY is powered off and regulators and clocks are disabled on late host-init errors Socionext UniPhier Pro5 controller driver: - Clean up uniphier-ep reg, clocks, resets, and their names in DT binding Synopsys DesignWare PCIe controller driver: - Restrict coherent DMA mask to 32 bits for MSI, but allow controller drivers to set 64-bit streaming DMA mask - Add eDMA engine support in both Root Port and Endpoint controllers Miscellaneous: - Remove MODULE_LICENSE from boolean drivers so they don't look like modules so modprobe can complain about them" * tag 'pci-v6.3-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (86 commits) PCI: dwc: Add Root Port and Endpoint controller eDMA engine support PCI: bt1: Set 64-bit DMA mask PCI: dwc: Restrict only coherent DMA mask for MSI address allocation dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it dmaengine: dw-edma: Add mem-mapped LL-entries support PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules PCI: hv: Drop duplicate PCI_MSI dependency PCI/P2PDMA: Annotate RCU dereference PCI/sysfs: Constify struct kobj_type pci_slot_ktype PCI: hotplug: Allow marking devices as disconnected during bind/unbind PCI: pciehp: Add Qualcomm quirk for Command Completed erratum PCI: qcom: Add IPQ8074 Gen3 port support dt-bindings: PCI: qcom: Add IPQ8074 Gen3 port dt-bindings: PCI: qcom: Sort compatibles alphabetically PCI: qcom: Fix host-init error handling PCI: qcom: Add SM8350 support dt-bindings: PCI: qcom: Add SM8350 dt-bindings: PCI: qcom-ep: Correct qcom,perst-regs dt-bindings: PCI: qcom: Unify MSM8996 and MSM8998 clock order ...	2023-02-24 16:51:40 -08:00
Linus Torvalds	a93e884edf	Driver core changes for 6.3-rc1 Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCY/ipdg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynL3gCgwzbcWu0So3piZyLiJKxsVo9C2EsAn3sZ9gN6 6oeFOjD3JDju3cQsfGgd =Su6W -----END PGP SIGNATURE----- Merge tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.3-rc1. There's a lot of changes this development cycle, most of the work falls into two different categories: - fw_devlink fixes and updates. This has gone through numerous review cycles and lots of review and testing by lots of different devices. Hopefully all should be good now, and Saravana will be keeping a watch for any potential regression on odd embedded systems. - driver core changes to work to make struct bus_type able to be moved into read-only memory (i.e. const) The recent work with Rust has pointed out a number of areas in the driver core where we are passing around and working with structures that really do not have to be dynamic at all, and they should be able to be read-only making things safer overall. This is the contuation of that work (started last release with kobject changes) in moving struct bus_type to be constant. We didn't quite make it for this release, but the remaining patches will be finished up for the release after this one, but the groundwork has been laid for this effort. Other than that we have in here: - debugfs memory leak fixes in some subsystems - error path cleanups and fixes for some never-able-to-be-hit codepaths. - cacheinfo rework and fixes - Other tiny fixes, full details are in the shortlog All of these have been in linux-next for a while with no reported problems" [ Geert Uytterhoeven points out that that last sentence isn't true, and that there's a pending report that has a fix that is queued up - Linus ] * tag 'driver-core-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (124 commits) debugfs: drop inline constant formatting for ERR_PTR(-ERROR) OPP: fix error checking in opp_migrate_dentry() debugfs: update comment of debugfs_rename() i3c: fix device.h kernel-doc warnings dma-mapping: no need to pass a bus_type into get_arch_dma_ops() driver core: class: move EXPORT_SYMBOL_GPL() lines to the correct place Revert "driver core: add error handling for devtmpfs_create_node()" Revert "devtmpfs: add debug info to handle()" Revert "devtmpfs: remove return value of devtmpfs_delete_node()" driver core: cpu: don't hand-override the uevent bus_type callback. devtmpfs: remove return value of devtmpfs_delete_node() devtmpfs: add debug info to handle() driver core: add error handling for devtmpfs_create_node() driver core: bus: update my copyright notice driver core: bus: add bus_get_dev_root() function driver core: bus: constify bus_unregister() driver core: bus: constify some internal functions driver core: bus: constify bus_get_kset() driver core: bus: constify bus_register/unregister_notifier() driver core: remove private pointer from struct bus_type ...	2023-02-24 12:58:55 -08:00
Bjorn Helgaas	3eb5d0f26f	Merge branch 'pci/misc' - Drop bogus kernel-doc marker in pci_endpoint_test.c (Randy Dunlap) - Fix epf_ntb_mw_bar_clear() kernel-doc (Yang Yingliang) - Constify struct kobj_type pci_slot_ktype (Thomas Weißschuh) * pci/misc: PCI: hv: Drop duplicate PCI_MSI dependency PCI/sysfs: Constify struct kobj_type pci_slot_ktype PCI: endpoint: pci-epf-vntb: Add epf_ntb_mw_bar_clear() num_mws kernel-doc misc: pci_endpoint_test: Drop initial kernel-doc marker	2023-02-22 13:47:32 -06:00
Bjorn Helgaas	90fb1a3652	Merge branch 'pci/controller/vmd' - Add pci_enable_link_state() to allow drivers to enable ASPM link state (Michael Bottini) - Add quirk to enable all ASPM link states and program LTR for devices below VMD (David E. Box) * pci/controller/vmd: PCI: vmd: Add quirk to configure PCIe ASPM and LTR PCI: vmd: Create feature grouping for client products PCI: vmd: Use PCI_VDEVICE in device list PCI/ASPM: Add pci_enable_link_state()	2023-02-22 13:47:31 -06:00
Bjorn Helgaas	0784d32c3d	Merge branch 'pci/controller/switchtec' - Return -EFAULT instead of unrelated codes for copy_to_user() errors (Bjorn Helgaas) * pci/controller/switchtec: PCI: switchtec: Return -EFAULT for copy_to_user() errors PCI: switchtec: Simplify switchtec_dma_mrpc_isr()	2023-02-22 13:47:31 -06:00
Bjorn Helgaas	b237474a90	Merge branch 'pci/controller/qcom' - Add DT compatible for qcom MSM8998 (Krzysztof Kozlowski) - Unify qcom MSM8996 and MSM8998 clock orderings (Krzysztof Kozlowski) - Correct qcom,perst-regs (Krzysztof Kozlowski) - Add qcom SM8350 DT binding and driver support (Dmitry Baryshkov) - Add qcom_pcie_host_deinit() so the PHY is powered off and regulators and clocks are disabled on late host-init errors (Johan Hovold) - Add IPQ8074 Gen3 port DT binding and driver support (the Gen2 port was already supported) (Robert Marko) * pci/controller/qcom: PCI: qcom: Add IPQ8074 Gen3 port support dt-bindings: PCI: qcom: Add IPQ8074 Gen3 port dt-bindings: PCI: qcom: Sort compatibles alphabetically PCI: qcom: Fix host-init error handling PCI: qcom: Add SM8350 support dt-bindings: PCI: qcom: Add SM8350 dt-bindings: PCI: qcom-ep: Correct qcom,perst-regs dt-bindings: PCI: qcom: Unify MSM8996 and MSM8998 clock order dt-bindings: PCI: qcom: Add MSM8998 specific compatible dt-bindings: PCI: qcom: Add oneOf to compatible match	2023-02-22 13:47:30 -06:00
Bjorn Helgaas	7cfd342bd1	Merge branch 'pci/controller/mvebu' - Mark mvebu driver as broken (Pali Rohár) * pci/controller/mvebu: PCI: mvebu: Mark driver as BROKEN	2023-02-22 13:47:30 -06:00
Bjorn Helgaas	181a60a0ee	Merge branch 'pci/controller/mt7621' - Delay PHY initialization to make boots reliable for ZBT WE1326 and ZBT WF3526-P and some Netgear models (Sergio Paracuellos) * pci/controller/mt7621: PCI: mt7621: Delay phy ports initialization	2023-02-22 13:47:29 -06:00
Bjorn Helgaas	a9cd360245	Merge branch 'pci/controller/imx6' - Add i.MX8MM, i.MX8MQ, i.MX8MP endpoint mode DT binding and driver support (Richard Zhu) * pci/controller/imx6: PCI: imx6: Add i.MX8MP PCIe EP support PCI: imx6: Add i.MX8MM PCIe EP support PCI: imx6: Add i.MX8MQ PCIe EP support PCI: imx6: Add i.MX PCIe EP mode support misc: pci_endpoint_test: Add i.MX8 PCIe EP device support dt-bindings: imx6q-pcie: Add i.MX8MP PCIe EP mode compatible string dt-bindings: imx6q-pcie: Add i.MX8MQ PCIe EP mode compatible string dt-bindings: imx6q-pcie: Add i.MX8MM PCIe EP mode compatible string	2023-02-22 13:47:29 -06:00
Bjorn Helgaas	5256d49380	Merge branch 'pci/controller/dwc' - Release previously-requested DW eDMA IRQs if request_irq() fails (Serge Semin) - Convert DW eDMA linked-list (ll) and data target (dt) from CPU-relative addresses to PCI bus addresses (Serge Semin) - Fix missing src/dst address for interleaved transfers (Serge Semin) - Enforce the DW eDMA restriction that interleaved transfers must increment src and dst addresses (Serge Semin) - Fix some invalid interleaved transfer semantics (Serge Semin) - Convert CPU-relative addresses to PCI bus addresses for eDMA engine (Serge Semin) - Drop chancnt initialization from dw-edma-core, since it is managed by the dmaengine core, e.g., in dma_async_device_channel_register() (Serge Semin) - Clean up bogus casting of debugfs_entries.reg addresses (Serge Semin) - Ignore debugfs file/directory creation errors (Serge Semin) - Allocate debugfs entries from the heap to prepare for multi-eDMA platforms (Serge Semin) - Simplify and rework register accessors to remove another obstacle to multi-eDMA platforms (Serge Semin) - Consolidate eDMA read/write channels in a single dma_device to simplify, better reflect the hardware design, and avoid a debugfs complaint (Serge Semin) - Move eDMA-specific debugfs nodes into existing dmaengine subdirectory (Serge Semin) - Fix a readq_ch() truncation from 64 to 32 bits (Serge Semin) - Use existing readq()/writeq rather than hand-coding new ones (Serge Semin) - Drop unnecessary data target region allocation in favor of existing dw_edma_chip members (Serge Semin) - Use parent device in eDMA controller name to prepare for multi-eDMA platforms (Serge Semin) - In addition to the existing MMIO accessors for linked list entries, add support for ioremapped entries for use by eDMA in Root Ports or local Endpoints (Serge Semin) - Convert DW_EDMA_PCIE so it depends on DW_EDMA instead of selecting it (Serge Semin) - Allow DWC drivers to set streaming DMA masks larger than 32 bits; previously both streaming and coherent DMA were limited to 32 bits because some PCI devices only support coherent 32-bit DMA for MSI (Serge Semin) - Set 64-bit streaming and coherent DMA mask for the bt1 driver (Serge Semin) - Add DW Root Port and Endpoint controller support for eDMA (Serge Semin) * pci/controller/dwc: PCI: dwc: Add Root Port and Endpoint controller eDMA engine support PCI: bt1: Set 64-bit DMA mask PCI: dwc: Restrict only coherent DMA mask for MSI address allocation dmaengine: dw-edma: Prepare dw_edma_probe() for builtin callers dmaengine: dw-edma: Depend on DW_EDMA instead of selecting it dmaengine: dw-edma: Add mem-mapped LL-entries support dmaengine: dw-edma: Skip cleanup procedure if no private data found dmaengine: dw-edma: Replace chip ID number with device name dmaengine: dw-edma: Drop DT-region allocation dmaengine: dw-edma: Use non-atomic io-64 methods dmaengine: dw-edma: Fix readq_ch() return value truncation dmaengine: dw-edma: Use DMA engine device debugfs subdirectory dmaengine: dw-edma: Join read/write channels into a single device dmaengine: dw-edma: Move eDMA data pointer to debugfs node descriptor dmaengine: dw-edma: Simplify debugfs context CSRs init procedure dmaengine: dw-edma: Rename debugfs dentry variables to 'dent' dmaengine: dw-edma: Convert debugfs descs to being heap-allocated dmaengine: dw-edma: Add dw_edma prefix to debugfs nodes descriptor dmaengine: dw-edma: Stop checking debugfs_create_*() return value dmaengine: dw-edma: Drop unnecessary debugfs reg casts dmaengine: dw-edma: Drop chancnt initialization dmaengine: dw-edma: Add PCI bus address getter to the remote EP glue driver dmaengine: dw-edma: Add CPU to PCI bus address translation dmaengine: dw-edma: Fix invalid interleaved xfers semantics dmaengine: dw-edma: Don't permit non-inc interleaved xfers dmaengine: dw-edma: Fix missing src/dst address of interleaved xfers dmaengine: dw-edma: Convert ll/dt phys address to PCI bus/DMA address dmaengine: dw-edma: Release requested IRQs on failure dmaengine: Fix dma_slave_config.dst_addr description	2023-02-22 13:47:29 -06:00
Bjorn Helgaas	33abd97c34	Merge branch 'pci/endpoint' - Convert dra7xx to threaded IRQ handler (Manivannan Sadhasivam) - Move tegra194 dw_pcie_ep_linkup() to threaded IRQ handler (Manivannan Sadhasivam) - Add a separate lock for the endpoint pci_epf list to avoid deadlock while running callbacks (Manivannan Sadhasivam) - Use callbacks instead of notifier chains to signal events from EPC to EPF drivers (Manivannan Sadhasivam) - Use link_up() callback in place of LINK_UP notifier (Manivannan Sadhasivam) * pci/endpoint: PCI: endpoint: Use link_up() callback in place of LINK_UP notifier PCI: endpoint: Use callback mechanism for passing events from EPC to EPF PCI: endpoint: Use a separate lock for protecting epc->pci_epf list PCI: tegra194: Move dw_pcie_ep_linkup() to threaded IRQ handler PCI: dra7xx: Use threaded IRQ handler for "dra7xx-pcie-main" IRQ	2023-02-22 13:47:28 -06:00
Bjorn Helgaas	191b410188	Merge branch 'pci/virtualization' - Avoid FLR for AMD FCH AHCI adapters to avoid a hardware defect (Damien Le Moal) - Add ACS quirk for Wangxun NICs that don't allow peer-to-peer between functions, but don't advertise an ACS Capability (Mengyuan Lou) * pci/virtualization: PCI: Add ACS quirk for Wangxun NICs PCI: Avoid FLR for AMD FCH AHCI adapters	2023-02-22 13:47:28 -06:00
Bjorn Helgaas	ebdce9e3d0	Merge branch 'pci/resource' - Realign space as required by bridge windows after dividing it up (Mika Westerberg) - Account for space required by other devices on the bus before distributing it all to bridges (Mika Westerberg) - Distribute spare resources to root bus devices as well as to other hotplug bridges (Mika Westerberg) - Fix bug that dropped root bus resources that end at zero, e.g., a host bridge that leads only to bus 00 (Geert Uytterhoeven) * pci/resource: PCI: Fix dropping valid root bus resources with .end = zero PCI: Distribute available resources for root buses, too PCI: Take other bus devices into account when distributing resources PCI: Align extra resources for hotplug bridges properly	2023-02-22 13:47:27 -06:00
Bjorn Helgaas	0b7af1ddcf	Merge branch 'pci/reset' - Always observe reset delay when waking devices from D3cold, e.g., after system sleep, regardless of whether we're allowed to runtime-suspend to D3cold (Lukas Wunner) - Unify reset and resume delays to wait for downstream devices after a bridge reset (Lukas Wunner) - Wait for downstream devices after a DPC-induced bridge reset (Lukas Wunner) * pci/reset: PCI/DPC: Await readiness of secondary bus after reset PCI: Unify delay handling for reset and resume PCI/PM: Observe reset delay irrespective of bridge_d3	2023-02-22 13:47:27 -06:00
Bjorn Helgaas	08a67024a0	Merge branch 'pci/pm' - Account for _S0W when deciding whether to put bridges in D3 to avoid missing hotplug events (Rafael J. Wysocki) * pci/pm: PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3()	2023-02-22 13:47:26 -06:00
Bjorn Helgaas	7260675a52	Merge branch 'pci/p2pdma' - Annotate RCU dereference (Logan Gunthorpe) * pci/p2pdma: PCI/P2PDMA: Annotate RCU dereference	2023-02-22 13:47:26 -06:00
Bjorn Helgaas	881766fe0d	Merge branch 'pci/kbuild' - Remove MODULE_LICENSE from boolean drivers so they don't look like modules so modprobe will complain about them (Nick Alcock) * pci/kbuild: PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules	2023-02-22 13:47:25 -06:00
Bjorn Helgaas	72d083a60a	Merge branch 'pci/iov' - Enlarge virtfn sysfs name buffer to prevent buffer overflow (Alexey V. Vissarionov) * pci/iov: PCI/IOV: Enlarge virtfn sysfs name buffer	2023-02-22 13:47:25 -06:00
Bjorn Helgaas	fec93576f7	Merge branch 'pci/hotplug' - Add quirk to work around Qualcomm hardware defect in Command Completed signaling (Manivannan Sadhasivam) - Remove locking to allow devices to be marked as disconnected immediately instead of waiting for concurrent bind/unbind to complete (Lukas Wunner) * pci/hotplug: PCI: hotplug: Allow marking devices as disconnected during bind/unbind PCI: pciehp: Add Qualcomm quirk for Command Completed erratum	2023-02-22 13:47:25 -06:00
Bjorn Helgaas	a17613298f	Merge branch 'pci/enumeration' - Implement portdrv .shutdown() method that calls service driver .remove() methods (which disables interrupt generation as required by .shutdown()), but doesn't disable bus mastering (which hangs on Loongson LS7A because of a hardware defect) (Huacai Chen) - Prevent MRRS increases for devices below Loongson LS7A to avoid hardware limitations (Huacai Chen) - Ignore devices with a firmware (DT/ACPI) node that says the device is disabled (Rob Herring) * pci/enumeration: PCI: Honor firmware's device disabled status PCI: loongson: Add more devices that need MRRS quirk PCI: loongson: Prevent LS7A MRRS increases PCI/portdrv: Prevent LS7A Bus Master clearing on shutdown	2023-02-22 13:47:24 -06:00
Serge Semin	939fbcd568	PCI: dwc: Add Root Port and Endpoint controller eDMA engine support Since the DW eDMA core now supports eDMA controllers embedded in locally accessible DW PCIe Root Ports and Endpoints, register these controllers when possible. To do that the DW PCIe core driver needs to perform some preparations first. First of all, it needs to find the eDMA controller CSRs base address, whether they are accessible over the Port Logic or iATU unrolled space. Afterwards it can try to auto-detect the eDMA controller availability and number of read/write channels. If none are found the procedure silently returns without error. Secondly, the platform is supposed to provide either combined or per-channel IRQ signals. If no valid IRQs set is found, the procedure returns without error to be backward compatible with platforms where DW PCIe controllers have eDMA but lack the IRQ description. Finally, before actually probing the eDMA device we need to allocate LLP items buffers. After that the DW eDMA can be registered. If registration is successful, a message regarding the number of detected Read/Write eDMA channels will be printed to the system as is done for the iATU settings. Note: the DW PCI controller driver (either host or endpoint mode) is currently always built-in, so if the DW eDMA core is built as a module (CONFIG_DW_EDMA=m), eDMA controllers will not be registered even if the dw-edma module is later loaded. Link: https://lore.kernel.org/r/20230113171409.30470-28-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Vinod Koul <vkoul@kernel.org>	2023-02-22 13:46:14 -06:00
Serge Semin	68373f2c0f	PCI: bt1: Set 64-bit DMA mask The DW PCIe Root Port IP core is synthesized with the 64-bit AXI address bus. Since the device is also equipped with the eDMA engine, explicitly set the device DMA mask so DMA engine clients can allocate data buffers anywhere in the 64-bit memory space. Link: https://lore.kernel.org/r/20230113171409.30470-27-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-22 13:46:14 -06:00
Serge Semin	6c784e21b3	PCI: dwc: Restrict only coherent DMA mask for MSI address allocation The MSI target address must be in the lowest 4GB memory to support PCI peripherals without 64-bit MSI support. Since the allocation is done from DMA coherent memory, set only the coherent DMA mask, leaving the streaming DMA mask alone. Thus streaming DMA operations will work with no artificial limitations. It will be specifically useful for the eDMA-capable controllers so the corresponding DMA engine clients would map the DMA buffers with no need for SWIOTLB for buffers allocated above 4GB. Add a brief comment about the reason allocating the MSI target address below 4GB. Link: https://lore.kernel.org/r/20230113171409.30470-26-Sergey.Semin@baikalelectronics.ru Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Robin Murphy <robin.murphy@arm.com>	2023-02-22 13:46:14 -06:00
Linus Torvalds	b8878e5a5c	hyperv-next for v6.3. -----BEGIN PGP SIGNATURE----- iQFHBAABCAAxFiEEIbPD0id6easf0xsudhRwX5BBoF4FAmPzgDgTHHdlaS5saXVA a2VybmVsLm9yZwAKCRB2FHBfkEGgXrc7CACfG4SSd8KkWU/y8Q66Irxdau0a3ETD KL4UNRKGIyKujufgFsme79O6xVSSsCNSay449wk20hqn8lnwbSRi9pUwmLn29hfd CMFleWIqgwGFfC1do5DRF1vrt1siuG/jVE07mWsEwuY2iHx/es+H7LiQKidhkndZ DhXRqoi7VYiJv5fRSumpkUJrMZiI96o9Mk09HUksdMwCn3+7RQEqHnlTH5KOozKF iMroDB72iNw5Na/USZwWL2EDRptENam3lFkPBeDPqNw0SbG4g65JGPR9DSa0Lkbq AGCJQkdU33mcYQG5MY7R4K1evufpOl/apqLW7h92j45Znr9ok6Vr2c1R =J1VT -----END PGP SIGNATURE----- Merge tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - allow Linux to run as the nested root partition for Microsoft Hypervisor (Jinank Jain and Nuno Das Neves) - clean up the return type of callback functions (Dawei Li) * tag 'hyperv-next-signed-20230220' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: x86/hyperv: Fix hv_get/set_register for nested bringup Drivers: hv: Make remove callback of hyperv driver void returned Drivers: hv: Enable vmbus driver for nested root partition x86/hyperv: Add an interface to do nested hypercalls Drivers: hv: Setup synic registers in case of nested root partition x86/hyperv: Add support for detecting nested hypervisor	2023-02-21 16:59:23 -08:00
Linus Torvalds	8cc01d43f8	RCU pull request for v6.3 This pull request contains the following branches: doc.2023.01.05a: Documentation updates. fixes.2023.01.23a: Miscellaneous fixes, perhaps most notably: o Throttling callback invocation based on the number of callbacks that are now ready to invoke instead of on the total number of callbacks. o Several patches that suppress false-positive boot-time diagnostics, for example, due to lockdep not yet being initialized. o Make expedited RCU CPU stall warnings dump stacks of any tasks that are blocking the stalled grace period. (Normal RCU CPU stall warnings have doen this for mnay years.) o Lazy-callback fixes to avoid delays during boot, suspend, and resume. (Note that lazy callbacks must be explicitly enabled, so this should not (yet) affect production use cases.) kvfree.2023.01.03a: Cause kfree_rcu() and friends to take advantage of polled grace periods, thus reducing memory footprint by almost two orders of magnitude, admittedly on a microbenchmark. This series also begins the transition from kfree_rcu(p) to kfree_rcu_mightsleep(p). This transition was motivated by bugs where kfree_rcu(p), which can block, was typed instead of the intended kfree_rcu(p, rh). srcu.2023.01.03a: SRCU updates, perhaps most notably fixing a bug that causes SRCU to fail when booted on a system with a non-zero boot CPU. This surprising situation actually happens for kdump kernels on the powerpc architecture. It also adds an srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side critical section to be handed off from one task to another. srcu-always.2023.02.02a: Cleans up the now-useless SRCU Kconfig option. There are a few more commits that are not yet acked or pulled into maintainer trees, and these will be in a pull request for a later merge window. tasks.2023.01.03a: RCU-tasks updates, perhaps most notably these fixes: o A strange interaction between PID-namespace unshare and the RCU-tasks grace period that results in a low-probability but very real hang. o A race between an RCU tasks rude grace period on a single-CPU system and CPU-hotplug addition of the second CPU that can result in a too-short grace period. o A race between shrinking RCU tasks down to a single callback list and queuing a new callback to some other CPU, but where that queuing is delayed for more than an RCU grace period. This can result in that callback being stranded on the non-boot CPU. torture.2023.01.05a: Torture-test updates and fixes. torturescript.2023.01.03a: Torture-test scripting updates and fixes. stall.2023.01.09a: Provide additional RCU CPU stall-warning information in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute timeout limit for expedited RCU CPU stall warnings. -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEbK7UrM+RBIrCoViJnr8S83LZ+4wFAmPq29UTHHBhdWxtY2tA a2VybmVsLm9yZwAKCRCevxLzctn7jAhVEACEAKJY1VJ9IUqz7CwzAYkzgRJfiygh oDUXmlqtm6ew9pr2GdLUVCVsUSldzBc0K7Djb/G1niv4JPs+v7YwupIV33+UbStU Qxt6ztTdxc4lKospLm1+2vF9ZdzVEmiP4wVCc4iDarv5FM3FpWSTNc8+L7qmlC+X myjv+GqMTxkXZBvYJOgJGFjDwN8noTd7Fr3mCCVLFm3PXMDa7tcwD6HRP5AqD2N8 qC5M6LEqepKVGmz0mYMLlSN1GPaqIsEcexIFEazRsPEivPh/iafyQCQ/cqxwhXmV vEt7u+dXGZT/oiDq9cJ+/XRDS2RyKIS6dUE14TiiHolDCn1ONESahfA/gXWKykC2 BaGPfjWXrWv/hwbeZ+8xEdkAvTIV92tGpXir9Fby1Z5PjP3balvrnn6hs5AnQBJb NdhRPLzy/dCnEF+CweAYYm1qvTo8cd5nyiNwBZHn7rEAIu3Axrecag1rhFl3AJ07 cpVMQXZtkQVa2X8aIRTUC+ijX6yIqNaHlu0HqNXgIUTDzL4nv5cMjOMzpNQP9/dZ FwAMZYNiOk9IlMiKJ8ZiVcxeiA8ouIBlkYM3k6vGrmiONZ7a/EV/mSHoJqI8bvqr AxUIJ2Ayhg3bxPboL5oKgCiLql0A7ZVvz6quX6McitWGMgaSvel1fDzT3TnZd41e 4AFBFd/+VedUGg== =bBYK -----END PGP SIGNATURE----- Merge tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull RCU updates from Paul McKenney: - Documentation updates - Miscellaneous fixes, perhaps most notably: - Throttling callback invocation based on the number of callbacks that are now ready to invoke instead of on the total number of callbacks - Several patches that suppress false-positive boot-time diagnostics, for example, due to lockdep not yet being initialized - Make expedited RCU CPU stall warnings dump stacks of any tasks that are blocking the stalled grace period. (Normal RCU CPU stall warnings have done this for many years) - Lazy-callback fixes to avoid delays during boot, suspend, and resume. (Note that lazy callbacks must be explicitly enabled, so this should not (yet) affect production use cases) - Make kfree_rcu() and friends take advantage of polled grace periods, thus reducing memory footprint by almost two orders of magnitude, admittedly on a microbenchmark This also begins the transition from kfree_rcu(p) to kfree_rcu_mightsleep(p). This transition was motivated by bugs where kfree_rcu(p), which can block, was typed instead of the intended kfree_rcu(p, rh) - SRCU updates, perhaps most notably fixing a bug that causes SRCU to fail when booted on a system with a non-zero boot CPU. This surprising situation actually happens for kdump kernels on the powerpc architecture This also adds an srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but allow an SRCU read-side critical section to be handed off from one task to another - Clean up the now-useless SRCU Kconfig option There are a few more commits that are not yet acked or pulled into maintainer trees, and these will be in a pull request for a later merge window - RCU-tasks updates, perhaps most notably these fixes: - A strange interaction between PID-namespace unshare and the RCU-tasks grace period that results in a low-probability but very real hang - A race between an RCU tasks rude grace period on a single-CPU system and CPU-hotplug addition of the second CPU that can result in a too-short grace period - A race between shrinking RCU tasks down to a single callback list and queuing a new callback to some other CPU, but where that queuing is delayed for more than an RCU grace period. This can result in that callback being stranded on the non-boot CPU - Torture-test updates and fixes - Torture-test scripting updates and fixes - Provide additional RCU CPU stall-warning information in kernels built with CONFIG_RCU_CPU_STALL_CPUTIME=y, and restore the full five-minute timeout limit for expedited RCU CPU stall warnings * tag 'rcu.2023.02.10a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits) rcu/kvfree: Add kvfree_rcu_mightsleep() and kfree_rcu_mightsleep() kernel/notifier: Remove CONFIG_SRCU init: Remove "select SRCU" fs/quota: Remove "select SRCU" fs/notify: Remove "select SRCU" fs/btrfs: Remove "select SRCU" fs: Remove CONFIG_SRCU drivers/pci/controller: Remove "select SRCU" drivers/net: Remove "select SRCU" drivers/md: Remove "select SRCU" drivers/hwtracing/stm: Remove "select SRCU" drivers/dax: Remove "select SRCU" drivers/base: Remove CONFIG_SRCU rcu: Disable laziness if lazy-tracking says so rcu: Track laziness during boot and suspend rcu: Remove redundant call to rcu_boost_kthread_setaffinity() rcu: Allow up to five minutes expedited RCU CPU stall-warning timeouts rcu: Align the output of RCU CPU stall warning messages rcu: Add RCU stall diagnosis information sched: Add helper nr_context_switches_cpu() ...	2023-02-21 10:45:51 -08:00
Reinette Chatre	e6cc6f1755	PCI/MSI: Clarify usage of pci_msix_free_irq() pci_msix_free_irq() is used to free an interrupt on a PCI/MSI-X interrupt domain. The API description specifies that the interrupt to be freed was allocated via pci_msix_alloc_irq_at(). This description limits the usage of pci_msix_free_irq() since pci_msix_free_irq() can also be used to free MSI-X interrupts allocated with, for example, pci_alloc_irq_vectors(). Remove the text stating that the interrupt to be freed had to be allocated with pci_msix_alloc_irq_at(). The needed struct msi_map need not be from pci_msix_alloc_irq_at() but can be created from scratch using pci_irq_vector() to obtain the Linux IRQ number. Highlight that pci_msix_free_irq() cannot be used to disable MSI-X to guide users that, for example, pci_free_irq_vectors() remains to be needed. Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Link: https://lore.kernel.org/lkml/87r0xsd8j4.ffs@tglx Link: https://lore.kernel.org/r/4c3e7a50d6e70f408812cd7ab199c6b4b326f9de.1676408572.git.reinette.chatre@intel.com	2023-02-21 08:25:14 +01:00
Alvaro Karsz	d089d69cc1	PCI: Avoid FLR for SolidRun SNET DPU rev 1 This patch fixes a FLR bug on the SNET DPU rev 1 by setting the PCI_DEV_FLAGS_NO_FLR_RESET flag. As there is a quirk to avoid FLR (quirk_no_flr), I added a new quirk to check the rev ID before calling to quirk_no_flr. Without this patch, a SNET DPU rev 1 may hang when FLR is applied. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-Id: <20230110165638.123745-3-alvaro.karsz@solid-run.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>	2023-02-20 19:26:55 -05:00
Nick Alcock	f98954b293	PCI: Remove MODULE_LICENSE so boolean drivers don't look like modules Since `8b41fc4454` ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, MODULE_LICENSE in non-modules causes modprobe to misidentify the object file as a module when it is not, and modprobe might succeed rather than failing with a suitable error message. For tristate modules that can be either built-in or loaded at runtime, modprobe succeeds in both cases: # modprobe ext4 [exit status zero if CONFIG_EXT4_FS=y or =m] For boolean modules like the Standard Hot Plug Controller driver (shpchp) that cannot be loaded at runtime, modprobe should always fail like this: # modprobe shpchp modprobe: FATAL: Module shpchp not found in directory /lib/modules/... [exit status non-zero regardless of CONFIG_HOTPLUG_PCI_SHPC] but prior to this commit, shpchp_core.c contained MODULE_LICENSE, so "modprobe shpchp" silently succeeded when it should have failed. Remove MODULE_LICENSE in files that cannot be built as modules. [bhelgaas: commit log, squash] Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Link: https://lore.kernel.org/r/20230216152410.4312-1-nick.alcock@oracle.com/ Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Rob Herring <robh@kernel.org> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-02-17 08:47:58 -06:00
Lukas Bulwahn	9574d57f2d	PCI: hv: Drop duplicate PCI_MSI dependency Commit `a474d3fbe2` ("PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN") removed PCI_MSI_IRQ_DOMAIN and made all previous references to it refer to PCI_MSI instead. PCI_HYPERV_INTERFACE already depended on PCI_MSI && PCI_MSI_IRQ_DOMAIN, so we ended up with a redundant dependency on PCI_MSI && PCI_MSI. Drop the duplicate. No functional change. Just a stylistic clean-up. Link: https://lore.kernel.org/r/20221215101310.9135-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-16 17:20:42 -06:00
Logan Gunthorpe	6606f4c3c4	PCI/P2PDMA: Annotate RCU dereference A dereference of the __rcu pointer was noticed by sparse: drivers/pci/p2pdma.c:199:44: sparse: sparse: dereference of noderef expression Dereference the __rcu pointer using rcu_dereference_protected() instead of accessing it directly. It's safe to use rcu_dereference_protected() because a reference is held on the pgmap's percpu reference counter and thus it cannot disappear. Link: https://lore.kernel.org/r/20230209172953.4597-1-logang@deltatee.com Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>	2023-02-16 16:31:12 -06:00
Thomas Weißschuh	1047377754	PCI/sysfs: Constify struct kobj_type pci_slot_ktype Since commit `ee6d3dd4ed` ("driver core: make kobj_type constant.") the driver core allows the usage of const struct kobj_type. Take advantage of this to constify the structure definition to prevent modification at runtime. Link: https://lore.kernel.org/r/20230216-kobj_type-pci-v1-1-46a63c8612b5@weissschuh.net Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-16 12:00:25 -06:00
Lukas Wunner	74ff8864cc	PCI: hotplug: Allow marking devices as disconnected during bind/unbind On surprise removal, pciehp_unconfigure_device() and acpiphp's trim_stale_devices() call pci_dev_set_disconnected() to mark removed devices as permanently offline. Thereby, the PCI core and drivers know to skip device accesses. However pci_dev_set_disconnected() takes the device_lock and thus waits for a concurrent driver bind or unbind to complete. As a result, the driver's ->probe and ->remove hooks have no chance to learn that the device is gone. That doesn't make any sense, so drop the device_lock and instead use atomic xchg() and cmpxchg() operations to update the device state. As a byproduct, an AB-BA deadlock reported by Anatoli is fixed which occurs on surprise removal with AER concurrently performing a bus reset. AER bus reset: INFO: task irq/26-aerdrv:95 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule rwsem_down_write_slowpath down_write_nested pciehp_reset_slot # acquires reset_lock pci_reset_hotplug_slot pci_slot_reset # acquires device_lock pci_bus_error_reset aer_root_reset pcie_do_recovery aer_process_err_devices aer_isr pciehp surprise removal: INFO: task irq/26-pciehp:96 blocked for more than 120 seconds. Tainted: G W 6.2.0-rc3-custom-norework-jan11+ schedule_preempt_disabled __mutex_lock mutex_lock_nested pci_dev_set_disconnected # acquires device_lock pci_walk_bus pciehp_unconfigure_device pciehp_disable_slot pciehp_handle_presence_or_link_change pciehp_ist # acquires reset_lock Link: https://bugzilla.kernel.org/show_bug.cgi?id=215590 Fixes: `a6bd101b8f` ("PCI: Unify device inaccessible") Link: https://lore.kernel.org/r/3dc88ea82bdc0e37d9000e413d5ebce481cbd629.1674205689.git.lukas@wunner.de Reported-by: Anatoli Antonovitch <anatoli.antonovitch@amd.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v4.20+ Cc: Keith Busch <kbusch@kernel.org>	2023-02-15 15:01:01 -06:00
Manivannan Sadhasivam	82b34b0800	PCI: pciehp: Add Qualcomm quirk for Command Completed erratum The Qualcomm PCI bridge device (Device ID 0x010e) found in chipsets such as SC8280XP used in Lenovo Thinkpad X13s, does not set the Command Completed bit unless writes to the Slot Command register change "Control" bits. This results in timeouts like below during boot and resume from suspend: pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x03c0 (issued 2020 msec ago) ... pcieport 0002:00:00.0: pciehp: Timeout on hotplug command 0x13f1 (issued 107724 msec ago) Add the device to the Command Completed quirk to mark commands "completed" immediately unless they change the "Control" bits. Link: https://lore.kernel.org/r/20230213144922.89982-1-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-14 11:47:49 -06:00
Robert Marko	f356132229	PCI: qcom: Add IPQ8074 Gen3 port support IPQ8074 has one Gen2 and one Gen3 port, with Gen2 port already supported. Add compatible for Gen3 port which uses the same controller as IPQ6018. Link: https://lore.kernel.org/r/20230113164449.906002-7-robimarko@gmail.com Signed-off-by: Robert Marko <robimarko@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-14 11:41:04 -06:00
Johan Hovold	997e010de9	PCI: qcom: Fix host-init error handling Implement the new host_deinit() callback so that the PHY is powered off and regulators and clocks are disabled also on late host-init errors. Link: https://lore.kernel.org/r/20221017114705.8277-2-johan+linaro@kernel.org Fixes: `82a823833f` ("PCI: qcom: Add Qualcomm PCIe controller driver") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Manivannan Sadhasivam <mani@kernel.org>	2023-02-14 11:41:03 -06:00
Dmitry Baryshkov	720e0d91c9	PCI: qcom: Add SM8350 support Add support for the PCIe host on Qualcomm SM8350 platform. Link: https://lore.kernel.org/r/20221118233242.2904088-4-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Bjorn Andersson <andersson@kernel.org>	2023-02-14 11:41:03 -06:00
Mengyuan Lou	a2b9b123cc	PCI: Add ACS quirk for Wangxun NICs Wangxun has verified there is no peer-to-peer between functions for the below selection of SFxxx, RP1000 and RP2000 NICS. They may be multi-function devices, but the hardware does not advertise ACS capability. Add an ACS quirk for these devices so the functions can be in independent IOMMU groups. Link: https://lore.kernel.org/r/20230207102419.44326-1-mengyuanlou@net-swift.com Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-13 18:05:59 -06:00
Geert Uytterhoeven	9d8ba74a18	PCI: Fix dropping valid root bus resources with .end = zero On r8a7791/koelsch: kmemleak: 1 new suspected memory leaks (see /sys/kernel/debug/kmemleak) # cat /sys/kernel/debug/kmemleak unreferenced object 0xc3a34e00 (size 64): comm "swapper/0", pid 1, jiffies 4294937460 (age 199.080s) hex dump (first 32 bytes): b4 5d 81 f0 b4 5d 81 f0 c0 b0 a2 c3 00 00 00 00 .]...].......... 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<fe3aa979>] __kmalloc+0xf0/0x140 [<34bd6bc0>] resource_list_create_entry+0x18/0x38 [<767046bc>] pci_add_resource_offset+0x20/0x68 [<b3f3edf2>] devm_of_pci_get_host_bridge_resources.constprop.0+0xb0/0x390 When coalescing two resources for a contiguous aperture, the second resource is enlarged to cover the full contiguous range, while the first resource is marked invalid. This invalidation is done by clearing the flags, start, and end members. When adding the initial resources to the bus later, invalid resources are skipped. Unfortunately, the check for an invalid resource considers only the end member, causing false positives. E.g. on r8a7791/koelsch, root bus resource 0 ("bus 00") is skipped, and no longer registered with pci_bus_insert_busn_res() (causing the memory leak), nor printed: pci-rcar-gen2 ee090000.pci: host bridge /soc/pci@ee090000 ranges: pci-rcar-gen2 ee090000.pci: MEM 0x00ee080000..0x00ee08ffff -> 0x00ee080000 pci-rcar-gen2 ee090000.pci: PCI: revision 11 pci-rcar-gen2 ee090000.pci: PCI host bridge to bus 0000:00 -pci_bus 0000:00: root bus resource [bus 00] pci_bus 0000:00: root bus resource [mem 0xee080000-0xee08ffff] Fix this by only skipping resources where all of the flags, start, and end members are zero. Fixes: `7c3855c423` ("PCI: Coalesce host bridge contiguous apertures") Link: https://lore.kernel.org/r/da0fcd5e86c74239be79c7cb03651c0fce31b515.1676036673.git.geert+renesas@glider.be Tested-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>	2023-02-13 16:40:45 -06:00
Manivannan Sadhasivam	f5edd8715e	PCI: endpoint: Use link_up() callback in place of LINK_UP notifier As a part of the transition towards callback mechanism for signalling the events from EPC to EPF, let's use the link_up() callback in the place of the LINK_UP notifier. This also removes the notifier support completely from the PCI endpoint framework. Link: https://lore.kernel.org/linux-pci/20230124071158.5503-6-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Kishon Vijay Abraham I <kishon@kernel.org>	2023-02-14 07:27:32 +09:00
Manivannan Sadhasivam	838125b07e	PCI: endpoint: Use callback mechanism for passing events from EPC to EPF Instead of using the notifiers for passing the events from EPC to EPF, let's introduce a callback based mechanism where the EPF drivers can populate relevant callbacks for EPC events they want to subscribe. The use of notifiers in kernel is not recommended if there is a real link between the sender and receiver, like in this case. Also, the existing atomic notifier forces the notification functions to be in atomic context while the caller may be in non-atomic context. For instance, the two in-kernel users of the notifiers, pcie-qcom and pcie-tegra194, both are calling the notifier functions in non-atomic context (from threaded IRQ handlers). This creates a sleeping in atomic context issue with the existing EPF_TEST driver that calls the EPC APIs that may sleep. For all these reasons, let's get rid of the notifier chains and use the simple callback mechanism for signalling the events from EPC to EPF drivers. This preserves the context of the caller and avoids the latency of going through a separate interface for triggering the notifications. As a first step of the transition, the core_init() callback is introduced in this commit, that'll replace the existing CORE_INIT notifier used for signalling the init complete event from EPC. During the occurrence of the event, EPC will go over the list of EPF drivers attached to it and will call the core_init() callback if available. Link: https://lore.kernel.org/linux-pci/20230124071158.5503-5-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Kishon Vijay Abraham I <kishon@kernel.org>	2023-02-14 07:27:25 +09:00
Manivannan Sadhasivam	d6dd5bafaa	PCI: endpoint: Use a separate lock for protecting epc->pci_epf list The EPC controller maintains a list of EPF drivers added to it. For protecting this list against the concurrent accesses, the epc->lock (used for protecting epc_ops) has been used so far. Since there were no users trying to use epc_ops and modify the pci_epf list simultaneously, this was not an issue. But with the addition of callback mechanism for passing the events, this will be a problem. Because the pci_epf list needs to be iterated first for getting hold of the EPF driver and then the relevant event specific callback needs to be called for the driver. If the same epc->lock is used, then it will result in a deadlock scenario. For instance, ... mutex_lock(&epc->lock); list_for_each_entry(epf, &epc->pci_epf, list) { epf->event_ops->core_init(epf); \| \|-> pci_epc_set_bar(); \| \|-> mutex_lock(&epc->lock) # DEADLOCK ... So to fix this issue, use a separate lock called "list_lock" for protecting the pci_epf list against the concurrent accesses. This lock will also be used by the callback mechanism. Link: https://lore.kernel.org/linux-pci/20230124071158.5503-4-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2023-02-14 07:27:15 +09:00
Manivannan Sadhasivam	c2cc5cdda4	PCI: tegra194: Move dw_pcie_ep_linkup() to threaded IRQ handler dw_pcie_ep_linkup() may take more time to execute depending on the EPF driver implementation. Calling this API in the hard IRQ handler is not encouraged since the hard IRQ handlers are supposed to complete quickly. So move the dw_pcie_ep_linkup() call to threaded IRQ handler. Link: https://lore.kernel.org/linux-pci/20230124071158.5503-3-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Vidya Sagar <vidyas@nvidia.com>	2023-02-14 07:26:56 +09:00
Manivannan Sadhasivam	da87d35a6e	PCI: dra7xx: Use threaded IRQ handler for "dra7xx-pcie-main" IRQ The "dra7xx-pcie-main" hard IRQ handler is just printing the IRQ status and calling the dw_pcie_ep_linkup() API if LINK_UP status is set. But the execution of dw_pcie_ep_linkup() depends on the EPF driver and may take more time depending on the EPF implementation. In general, hard IRQ handlers are supposed to return quickly and not block for so long. Moreover, there is no real need of the current IRQ handler to be a hard IRQ handler. So switch to the threaded IRQ handler for the "dra7xx-pcie-main" IRQ. Link: https://lore.kernel.org/linux-pci/20230124071158.5503-2-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Kishon Vijay Abraham I <kishon@ti.com>	2023-02-14 07:26:45 +09:00
Rob Herring	6fffbc7ae1	PCI: Honor firmware's device disabled status If a device has a firmware node (DT/ACPI), and the device is marked disabled, that is currently ignored. Add a check for this condition and bail out creating the pci_dev. This assumes the config space for the device can still be accessed because they already have by this point in order to identify the device. Link: https://lore.kernel.org/r/20230210164351.2687475-1-robh@kernel.org Tested-by: Binbin Zhou <zhoubinbin@loongson.cn> Signed-off-by: Rob Herring <robh@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Liu Peibao <liupeibao@loongson.cn> Cc: Huacai Chen <chenhuacai@loongson.cn>	2023-02-13 15:29:56 -06:00
Huacai Chen	c768f8c5f4	PCI: loongson: Add more devices that need MRRS quirk Loongson-2K SOC and LS7A2000 chipset add new PCI IDs that need MRRS quirk. Add them. Link: https://lore.kernel.org/r/20230211023321.3530080-1-chenhuacai@loongson.cn Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-13 15:29:27 -06:00
Linus Torvalds	4cfd5afcd8	pci-v6.2-fixes-2 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmPmuwEUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vzEYg//XHHddqDRiZmx9McETDAi33rJ9DDo CMCwiydUzGlDl/IDnBxwcmq0K5wiA5jFvXlRFmzHfnHGpWpRf6ntcT436QnhKe4G /DAXxVdZGWr079m7s4NKjByDunhkkkT/elapFCtZTwXxMkUvbprM0ozMdtSMnC/M RDCJKfaV2CKUkl/5Mk9Iw3vzrr62PP8fVHHMIr+6O39frZ2+MrzYCgpGkW0pubmT He0gmeVnNFzR6qB1GraXVNwlapjPjzvHe1IggDDLJRxM4+sz8qKJz0vKew10JwSo R5s8ACfTNtHwY45af1EWIeO9BoGD3soNLvWmK/5uNrCWJx9wnczQuz4b/Km2y02Y KCJaudiC6EfAzu5gCSgao3VZ/EQ45sHrYZN9qiyDujOgAUUPl0oonwa1HW/1WUSH Pd/ff9o78vASxdZP1o1hF0davNET1HOsvXGxQj71TJLXVsB2pifWvAoNocHHnpoe cPCix8t3c4pgXzI0RG04tcfqGWAgsaVz73SdU0/g5qk+hPRvypjcY1lw6U66sk9f /ZNII5fSX6hIWTetD27JiCZNOxJq1jikxOD4/LZizMTjdZYf6VxjDxkIaLS99pZw RCOQ8chKVemr12lD//8eFUJJvblug2aTlHIwFnMuKiavy6pL5Sm1zGMBrqhYmUSO pkNXzFaZe+GyF3k= =NSFX -----END PGP SIGNATURE----- Merge tag 'pci-v6.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci Pull PCI fixes from Bjorn Helgaas: - Move to a shared PCI git tree (Bjorn Helgaas) - Add Krzysztof Wilczyński as another PCI maintainer (Lorenzo Pieralisi) - Revert a couple ASPM patches to fix suspend/resume regressions (Bjorn Helgaas) * tag 'pci-v6.2-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming" Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume" MAINTAINERS: Promote Krzysztof to PCI controller maintainer MAINTAINERS: Move to shared PCI tree	2023-02-10 14:18:48 -08:00
Bjorn Helgaas	ff209ecc37	Revert "PCI/ASPM: Refactor L1 PM Substates Control Register programming" This reverts commit `5e85eba6f5`. Thomas Witt reported that `5e85eba6f5` ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") broke suspend/resume on a Tuxedo Infinitybook S 14 v5, which seems to use a Clevo L140CU Mainboard. The main symptom is: iwlwifi 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible nvme 0000:03:00.0: Unable to change power state from D3hot to D0, device inaccessible and the machine is only partially usable after resume. It can't run dmesg and can't do a clean reboot. This happens on every suspend/resume cycle. Revert `5e85eba6f5` until we can figure out the root cause. Fixes: `5e85eba6f5` ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Reported-by: Thomas Witt <kernel@witt.link> Tested-by: Thomas Witt <kernel@witt.link> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v6.1+ Cc: Vidya Sagar <vidyas@nvidia.com>	2023-02-10 15:30:24 -06:00
Bjorn Helgaas	a7152be79b	Revert "PCI/ASPM: Save L1 PM Substates Capability for suspend/resume" This reverts commit `4ff116d0d5`. Tasev Nikola and Mark Enriquez reported that resume from suspend was broken in v6.1-rc1. Tasev bisected to `a47126ec29` ("PCI/PTM: Cache PTM Capability offset"), but we can't figure out how that could be related. Mark saw the same symptoms and bisected to `4ff116d0d5` ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume"), which does have a connection: it restores L1 Substates configuration while ASPM L1 may be enabled: pci_restore_state pci_restore_aspm_l1ss_state aspm_program_l1ss pci_write_config_dword(PCI_L1SS_CTL1, ctl1) # L1SS restore pci_restore_pcie_state pcie_capability_write_word(PCI_EXP_LNKCTL, cap[i++]) # L1 restore which is a problem because PCIe r6.0, sec 5.5.4, requires that: If setting either or both of the enable bits for ASPM L1 PM Substates, both ports must be configured as described in this section while ASPM L1 is disabled. Separately, Thomas Witt reported that `5e85eba6f5` ("PCI/ASPM: Refactor L1 PM Substates Control Register programming") broke suspend/resume, and it depends on `4ff116d0d5`. Revert `4ff116d0d5` ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") to fix the resume issue and enable revert of `5e85eba6f5` to fix the issue Thomas reported. Note that reverting `4ff116d0d5` means L1 Substates config may be lost on suspend/resume. As far as we know the system will use more power but will still work correctly. Fixes: `4ff116d0d5` ("PCI/ASPM: Save L1 PM Substates Capability for suspend/resume") Link: https://bugzilla.kernel.org/show_bug.cgi?id=216782 Link: https://bugzilla.kernel.org/show_bug.cgi?id=216877 Reported-by: Tasev Nikola <tasev.stefanoska@skynet.be> Reported-by: Mark Enriquez <enriquezmark36@gmail.com> Reported-by: Thomas Witt <kernel@witt.link> Tested-by: Mark Enriquez <enriquezmark36@gmail.com> Tested-by: Thomas Witt <kernel@witt.link> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v6.1+ Cc: Vidya Sagar <vidyas@nvidia.com>	2023-02-10 15:29:53 -06:00
Lukas Wunner	53b54ad074	PCI/DPC: Await readiness of secondary bus after reset pci_bridge_wait_for_secondary_bus() is called after a Secondary Bus Reset, but not after a DPC-induced Hot Reset. As a result, the delays prescribed by PCIe r6.0 sec 6.6.1 are not observed and devices on the secondary bus may be accessed before they're ready. One affected device is Intel's Ponte Vecchio HPC GPU. It comprises a PCIe switch whose upstream port is not immediately ready after reset. Because its config space is restored too early, it remains in D0uninitialized, its subordinate devices remain inaccessible and DPC recovery fails with messages such as: i915 0000:8c:00.0: can't change power state from D3cold to D0 (config space inaccessible) intel_vsec 0000:8e:00.1: can't change power state from D3cold to D0 (config space inaccessible) pcieport 0000:89:02.0: AER: device recovery failed Fix it. Link: https://lore.kernel.org/r/9f5ff00e1593d8d9a4b452398b98aa14d23fca11.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: stable@vger.kernel.org	2023-02-09 12:46:15 -06:00
Pali Rohár	b3574f579e	PCI: mvebu: Mark driver as BROKEN People are reporting that pci-mvebu.c driver does not work with recent mainline kernel. There are more bugs which prevents its for daily usage. So lets mark it as broken for now, until somebody would be able to fix it in mainline kernel. Link: https://lore.kernel.org/r/20230114164125.1298-1-pali@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-02-09 10:01:30 +01:00
Dan Williams	5485eb9559	Merge branch 'for-6.3/cxl' into cxl/next Merge the general CXL updates with fixes targeting v6.2-rc for v6.3. Resolve a conflict with the fix and move of cxl_report_and_clear() from pci.c to core/pci.c.	2023-02-07 11:12:24 -08:00
Lukas Wunner	ac91e69805	PCI: Unify delay handling for reset and resume Sheng Bi reports that pci_bridge_secondary_bus_reset() may fail to wait for devices on the secondary bus to become accessible after reset: Although it does call pci_dev_wait(), it erroneously passes the bridge's pci_dev rather than that of a child. The bridge of course is always accessible while its secondary bus is reset, so pci_dev_wait() returns immediately. Sheng Bi proposes introducing a new pci_bridge_secondary_bus_wait() function which is called from pci_bridge_secondary_bus_reset(): https://lore.kernel.org/linux-pci/20220523171517.32407-1-windy.bi.enflame@gmail.com/ However we already have pci_bridge_wait_for_secondary_bus() which does almost exactly what we need. So far it's only called on resume from D3cold (which implies a Fundamental Reset per PCIe r6.0 sec 5.8). Re-using it for Secondary Bus Resets is a leaner and more rational approach than introducing a new function. That only requires a few minor tweaks: - Amend pci_bridge_wait_for_secondary_bus() to await accessibility of the first device on the secondary bus by calling pci_dev_wait() after performing the prescribed delays. pci_dev_wait() needs two parameters, a reset reason and a timeout, which callers must now pass to pci_bridge_wait_for_secondary_bus(). The timeout is 1 sec for resume (PCIe r6.0 sec 6.6.1) and 60 sec for reset (commit `821cdad5c4` ("PCI: Wait up to 60 seconds for device to become ready after FLR")). Introduce a PCI_RESET_WAIT macro for the 1 sec timeout. - Amend pci_bridge_wait_for_secondary_bus() to return 0 on success or -ENOTTY on error for consumption by pci_bridge_secondary_bus_reset(). - Drop an unnecessary 1 sec delay from pci_reset_secondary_bus() which is now performed by pci_bridge_wait_for_secondary_bus(). A static delay this long is only necessary for Conventional PCI, so modern PCIe systems benefit from shorter reset times as a side effect. Fixes: `6b2f1351af` ("PCI: Wait for device to become ready after secondary bus reset") Link: https://lore.kernel.org/r/da77c92796b99ec568bd070cbe4725074a117038.1673769517.git.lukas@wunner.de Reported-by: Sheng Bi <windy.bi.enflame@gmail.com> Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v4.17+	2023-02-07 11:54:03 -06:00
Lukas Wunner	8ef0217227	PCI/PM: Observe reset delay irrespective of bridge_d3 If a PCI bridge is suspended to D3cold upon entering system sleep, resuming it entails a Fundamental Reset per PCIe r6.0 sec 5.8. The delay prescribed after a Fundamental Reset in PCIe r6.0 sec 6.6.1 is sought to be observed by: pci_pm_resume_noirq() pci_pm_bridge_power_up_actions() pci_bridge_wait_for_secondary_bus() However, pci_bridge_wait_for_secondary_bus() bails out if the bridge_d3 flag is not set. That flag indicates whether a bridge is allowed to suspend to D3cold at runtime. Hence no delay is observed on resume from system sleep if runtime D3cold is forbidden. That doesn't make any sense, so drop the bridge_d3 check from pci_bridge_wait_for_secondary_bus(). The purpose of the bridge_d3 check was probably to avoid delays if a bridge remained in D0 during suspend. However the sole caller of pci_bridge_wait_for_secondary_bus(), pci_pm_bridge_power_up_actions(), is only invoked if the previous power state was D3cold. Hence the additional bridge_d3 check seems superfluous. Fixes: `ad9001f2f4` ("PCI/PM: Add missing link delays required by the PCIe spec") Link: https://lore.kernel.org/r/eb37fa345285ec8bacabbf06b020b803f77bdd3d.1673769517.git.lukas@wunner.de Tested-by: Ravi Kishore Koppuravuri <ravi.kishore.koppuravuri@intel.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: stable@vger.kernel.org # v5.5+	2023-02-07 11:54:03 -06:00
Mika Westerberg	7180c1d086	PCI: Distribute available resources for root buses, too Previously we distributed spare resources only upon hot-add, so if the initial root bus scan found devices that had not been fully configured by the BIOS, we allocated only enough resources to cover what was then present. If some of those devices were hotplug bridges, we did not leave any additional resource space for future expansion. Distribute the available resources for root buses, too, to make this work the same way as the normal hotplug case. A previous commit to do this was reverted due to a regression reported by Jonathan Cameron: `e96e27fc6f` ("PCI: Distribute available resources for root buses, too") `5632e2beaf` ("Revert "PCI: Distribute available resources for root buses, too"") This commit changes pci_bridge_resources_not_assigned() to work with bridges that do not have all the resource windows programmed by the boot firmware (previously we expected all I/O, memory and prefetchable memory were programmed). Link: https://bugzilla.kernel.org/show_bug.cgi?id=216000 Link: https://lore.kernel.org/r/20220905080232.36087-5-mika.westerberg@linux.intel.com Link: https://lore.kernel.org/r/20230131092405.29121-4-mika.westerberg@linux.intel.com Reported-by: Chris Chiu <chris.chiu@canonical.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-07 11:36:35 -06:00
Mika Westerberg	9db0b9b6a1	PCI: Take other bus devices into account when distributing resources A PCI bridge may reside on a bus with other devices as well. The resource distribution code does not take this into account and therefore it expands the bridge resource windows too much, not leaving space for the other devices (or functions of a multifunction device). This leads to an issue that Jonathan reported when running QEMU with the following topology (QEMU parameters): -device pcie-root-port,port=0,id=root_port13,chassis=0,slot=2 \ -device x3130-upstream,id=sw1,bus=root_port13,multifunction=on \ -device e1000,bus=root_port13,addr=0.1 \ -device xio3130-downstream,id=fun1,bus=sw1,chassis=0,slot=3 \ -device e1000,bus=fun1 The first e1000 NIC here is another function in the switch upstream port. This leads to following errors: pci 0000:00:04.0: bridge window [mem 0x10200000-0x103fffff] to [bus 02-04] pci 0000:02:00.0: bridge window [mem 0x10200000-0x103fffff] to [bus 03-04] pci 0000:02:00.1: BAR 0: failed to assign [mem size 0x00020000] e1000 0000:02:00.1: can't ioremap BAR 0: [??? 0x00000000 flags 0x0] Fix this by taking into account bridge windows, device BARs and SR-IOV PF BARs on the bus (PF BARs include space for VF BARS so only account PF BARs), including the ones belonging to bridges themselves if it has any. Link: https://lore.kernel.org/linux-pci/20221014124553.0000696f@huawei.com/ Link: https://lore.kernel.org/linux-pci/6053736d-1923-41e7-def9-7585ce1772d9@ixsystems.com/ Link: https://lore.kernel.org/r/20230131092405.29121-3-mika.westerberg@linux.intel.com Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reported-by: Alexander Motin <mav@ixsystems.com> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-07 11:04:25 -06:00
Mika Westerberg	08f0a15ee8	PCI: Align extra resources for hotplug bridges properly After division the extra resource space per hotplug bridge may not be aligned according to the window alignment, so align it before passing it down for further distribution. Link: https://lore.kernel.org/r/20230131092405.29121-2-mika.westerberg@linux.intel.com Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-07 10:54:40 -06:00
Sergio Paracuellos	0cb2a8f345	PCI: mt7621: Delay phy ports initialization Some devices like ZBT WE1326 and ZBT WF3526-P and some Netgear models need to delay phy port initialization after calling the mt7621_pcie_init_port() driver function to get into reliable boots for both warm and hard resets. The delay required to detect the ports seems to be in the range [75-100] milliseconds. If the ports are not detected the controller is not functional. There is no datasheet or something similar to really understand why this extra delay is needed only for these devices and it is not for most of the boards that are built on mt7621 SoC. This issue has been reported by openWRT community and the complete discussion is in [0]. The 100 milliseconds delay has been tested in all devices to validate it. Add the extra 100 milliseconds delay to fix the issue. [0]: https://github.com/openwrt/openwrt/pull/11220 Link: https://lore.kernel.org/r/20221231074041.264738-1-sergio.paracuellos@gmail.com Fixes: `2bdd5238e7` ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver") Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-02-03 10:16:40 +01:00
Geert Uytterhoeven	a80becc56d	PCI: tegra: Convert to devm_of_phy_optional_get() Use the new devm_of_phy_optional_get() helper instead of open-coding the same operation. Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://lore.kernel.org/r/56508eeadf7fa8692877e872871f10294d48c49d.1674584626.git.geert+renesas@glider.be Signed-off-by: Vinod Koul <vkoul@kernel.org>	2023-02-03 11:19:35 +05:30
Paul E. McKenney	a8f0ff9185	drivers/pci/controller: Remove "select SRCU" Now that the SRCU Kconfig option is unconditionally selected, there is no longer any point in selecting it. Therefore, remove the "select SRCU" Kconfig statements. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Lorenzo Pieralisi <lpieralisi@kernel.org> Cc: Rob Herring <robh@kernel.org> Cc: "Krzysztof Wilczyński" <kw@linux.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: <linux-pci@vger.kernel.org> Acked-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: John Ogness <john.ogness@linutronix.de>	2023-02-02 16:26:06 -08:00
David E. Box	f492edb40b	PCI: vmd: Add quirk to configure PCIe ASPM and LTR PCIe ports reserved for VMD use are not visible to BIOS and therefore not configured to enable PCIe ASPM or LTR values (which BIOS will configure if they are not set). Lack of this programming results in high power consumption on laptops as reported in bugzilla. For affected products use pci_enable_link_state to set the allowed link states for devices on the root ports. Also set the LTR value to the maximum value needed for the SoC. This is a workaround for products from Rocket Lake through Alder Lake. Raptor Lake, the latest product at this time, has already implemented LTR configuring in BIOS. Future products will move ASPM configuration back to BIOS as well. As this solution is intended for laptops, support is not added for hotplug or for devices downstream of a switch on the root port. Link: https://bugzilla.kernel.org/show_bug.cgi?id=212355 Link: https://bugzilla.kernel.org/show_bug.cgi?id=215063 Link: https://bugzilla.kernel.org/show_bug.cgi?id=213717 Link: https://lore.kernel.org/r/20230120031522.2304439-5-david.e.box@linux.intel.com Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev> Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>	2023-02-02 16:02:40 +01:00
David E. Box	14d2079af6	PCI: vmd: Create feature grouping for client products Simplify the device ID list by creating a grouping of features shared by client products. Suggested-by: Jon Derrick <jonathan.derrick@linux.dev> Link: https://lore.kernel.org/r/20230120031522.2304439-4-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>	2023-02-02 16:02:34 +01:00
David E. Box	cca0dfecdb	PCI: vmd: Use PCI_VDEVICE in device list Use PCI_VDEVICE to simplify the device table. Link: https://lore.kernel.org/r/20230120031522.2304439-3-david.e.box@linux.intel.com Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Jon Derrick <jonathan.derrick@linux.dev> Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>	2023-02-02 16:01:57 +01:00
Michael Bottini	de82f60f9c	PCI/ASPM: Add pci_enable_link_state() Add pci_enable_link_state() to allow devices to change the default BIOS configured states. Clears the BIOS default settings then sets the new states and reconfigures the link under the semaphore. Also add PCIE_LINK_STATE_ALL macro for convenience for callers that want to enable all link states. Link: https://lore.kernel.org/r/20230120031522.2304439-2-david.e.box@linux.intel.com Signed-off-by: Michael Bottini <michael.a.bottini@linux.intel.com> Signed-off-by: David E. Box <david.e.box@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-02 16:01:42 +01:00
Huacai Chen	8b3517f88f	PCI: loongson: Prevent LS7A MRRS increases Except for isochronous-configured devices, software may set Max_Read_Request_Size (MRRS) to any value up to 4096. If a device issues a read request with size greater than the completer's Max_Payload_Size (MPS), the completer is required to break the response into multiple completions. Instead of correctly responding with multiple completions to a large read request, some LS7A Root Ports respond with a Completer Abort. To prevent this, the MRRS must be limited to an implementation-specific value. The OS cannot detect that value, so rely on BIOS to configure MRRS before booting, and quirk the Root Ports so we never set an MRRS larger than that BIOS value for any downstream device. N.B. Hot-added devices are not configured by BIOS, and they power up with MRRS = 512 bytes, so these devices will be limited to 512 bytes. If the LS7A limit is smaller, those hot-added devices may not work correctly, but per [1], hotplug is not supported with this chipset revision. [1] https://lore.kernel.org/r/073638a7-ae68-2847-ac3d-29e5e760d6af@loongson.cn [bhelgaas: commit log] Link: https://bugzilla.kernel.org/show_bug.cgi?id=216884 Link: https://lore.kernel.org/r/20230201043018.778499-3-chenhuacai@loongson.cn Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-01 12:49:29 -06:00
Huacai Chen	62b6dee1b4	PCI/portdrv: Prevent LS7A Bus Master clearing on shutdown After `cc27b735ad` ("PCI/portdrv: Turn off PCIe services during shutdown") we observe hangs during poweroff/reboot on systems with LS7A chipset. This happens because the portdrv .shutdown() method (pcie_portdrv_remove()) clears PCI_COMMAND_MASTER via pci_disable_device(), which prevents bridges from forwarding memory or I/O Requests in the upstream direction (PCIe r6.0, sec 7.5.1.1.3). LS7A Root Ports have a hardware defect: clearing PCI_COMMAND_MASTER also prevents the bridge from forwarding CPU MMIO requests in the downstream direction, and these MMIO accesses to devices below the bridge happen even after .shutdown(), e.g., to print console messages. LS7A neither forwards the requests nor sends an unsuccessful completion to the CPU, so the CPU waits forever, resulting in the hang. The purpose of .shutdown() is to disable interrupts and DMA from the device. PCIe ports may generate interrupts (either MSI/MSI-X or INTx) for AER, DPC, PME, hotplug, etc., but they never perform DMA except MSI/MSI-X. Clearing PCI_COMMAND_MASTER effectively disables MSI/MSI-X, but not INTx. The port service driver .remove() methods clear the interrupt enables in PCI_ERR_ROOT_COMMAND, PCI_EXP_DPC_CTL, PCI_EXP_SLTCTL, and PCI_EXP_RTCTL, etc., which disables interrupts regardless of whether they are MSI/MSI-X or INTx. Add a pcie_portdrv_shutdown() method that calls all the port service driver .remove() methods to clear the interrupt enables for each service but does not clear Bus Mastering on the port itself. [bhelgaas: commit log] Link: https://lore.kernel.org/r/20230201043018.778499-2-chenhuacai@loongson.cn Signed-off-by: Huacai Chen <chenhuacai@loongson.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-02-01 12:05:28 -06:00
Damien Le Moal	63ba51db24	PCI: Avoid FLR for AMD FCH AHCI adapters PCI passthrough to VMs does not work with AMD FCH AHCI adapters: the guest OS fails to correctly probe devices attached to the controller due to FIS communication failures: ata4: softreset failed (1st FIS failed) ... ata4.00: qc timeout after 5000 msecs (cmd 0xec) ata4.00: failed to IDENTIFY (I/O error, err_mask=0x4) Forcing the "bus" reset method before unbinding & binding the adapter to the vfio-pci driver solves this issue, e.g.: echo "bus" > /sys/bus/pci/devices/<ID>/reset_method gives a working guest OS, indicating that the default FLR reset method doesn't work correctly. Apply quirk_no_flr() to AMD FCH AHCI devices to work around this issue. Link: https://lore.kernel.org/r/20230128013951.523247-1-damien.lemoal@opensource.wdc.com Reported-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org	2023-01-30 09:59:15 -06:00
Greg Kroah-Hartman	2a81ada32f	driver core: make struct bus_type.uevent() take a const * The uevent() callback in struct bus_type should not be modifying the device that is passed into it, so mark it as a const * and propagate the function signature changes out into all relevant subsystems that use this callback. Acked-by: Rafael J. Wysocki <rafael@kernel.org> Acked-by: Hans de Goede <hdegoede@redhat.com> Link: https://lore.kernel.org/r/20230111113018.459199-16-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-01-27 13:45:52 +01:00
Bjorn Helgaas	6b985af556	PCI/AER: Remove redundant Device Control Error Reporting Enable The following bits in the PCIe Device Control register enable sending of ERR_COR, ERR_NONFATAL, or ERR_FATAL Messages (or reporting internally in the case of Root Ports): Correctable Error Reporting Enable Non-Fatal Error Reporting Enable Fatal Error Reporting Enable Unsupported Request Reporting Enable These enable bits are set by pci_enable_pcie_error_reporting(), and since `f26e58bf6f` ("PCI/AER: Enable error reporting when AER is native"), we do that in this path during enumeration: pci_init_capabilities pci_aer_init pci_enable_pcie_error_reporting Previously, the AER service driver also traversed the hierarchy when claiming a Root Port, enabling error reporting for downstream devices, but this is redundant. Remove the code that enables this error reporting in the AER .probe() path. Also remove similar code that disables error reporting in the AER .remove() path. Note that these Device Control Reporting Enable bits do not control interrupt generation. That's done by the similarly-named bits in the AER Root Error Command register, which are still set by aer_probe() and cleared by aer_remove(), since the AER service driver handles those interrupts. See PCIe r6.0, sec 6.2.6. Link: https://lore.kernel.org/r/20230118234612.272916-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Stefan Roese <sr@denx.de> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Cc: Ashok Raj <ashok.raj@intel.com> Cc: Keith Busch <kbusch@kernel.org>	2023-01-26 17:06:13 -06:00
Yang Yingliang	fd858402c6	PCI: endpoint: pci-epf-vntb: Add epf_ntb_mw_bar_clear() num_mws kernel-doc `8e4bfbe644` ("PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()") added a "num_mws" parameter to epf_ntb_mw_bar_clear() but failed to add kernel-doc for num_mws. Add kernel-doc for num_mws on epf_ntb_mw_bar_clear(). Fixes: `8e4bfbe644` ("PCI: endpoint: pci-epf-vntb: fix error handle in epf_ntb_mw_bar_init()") Link: https://lore.kernel.org/r/20230103024907.293853-1-yangyingliang@huawei.com Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-01-18 14:14:04 -06:00
Bjorn Helgaas	ddc10938e0	PCI: switchtec: Return -EFAULT for copy_to_user() errors switchtec_dev_read() didn't handle copy_to_user() errors correctly: it assigned "rc = -EFAULT", but actually returned either "size", -ENXIO, or -EBADMSG instead. Update the failure cases to unlock mrpc_mutex and return -EFAULT directly. Link: https://lore.kernel.org/r/20221216162126.207863-3-helgaas@kernel.org Fixes: `080b47def5` ("MicroSemi Switchtec management interface driver") Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>	2023-01-18 11:11:20 -06:00
Bjorn Helgaas	4e353ff40a	PCI: switchtec: Simplify switchtec_dma_mrpc_isr() The "ret" variable in switchtec_dma_mrpc_isr() is superfluous. Remove it and just return the value. No functional change intended. Link: https://lore.kernel.org/r/20221216162126.207863-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>	2023-01-18 11:11:14 -06:00
Alexey V. Vissarionov	ea0b5aa5f1	PCI/IOV: Enlarge virtfn sysfs name buffer The sysfs link name "virtfn%u" constructed by pci_iov_sysfs_link() requires 17 bytes to contain the longest possible string. Increase VIRTFN_ID_LEN to accommodate that. Found by Linux Verification Center (linuxtesting.org) with SVACE. [bhelgaas: commit log, comment at #define] Fixes: `dd7cc44d0b` ("PCI: add SR-IOV API for Physical Function driver") Link: https://lore.kernel.org/r/20221218033347.23743-1-gremlin@altlinux.org Signed-off-by: Alexey V. Vissarionov <gremlin@altlinux.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-01-18 10:54:41 -06:00
Dawei Li	96ec293962	Drivers: hv: Make remove callback of hyperv driver void returned Since commit `fc7a6209d5` ("bus: Make remove callback return void") forces bus_type::remove be void-returned, it doesn't make much sense for any bus based driver implementing remove callbalk to return non-void to its caller. As such, change the remove function for Hyper-V VMBus based drivers to return void. Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB2323A93C55526E4DF239D3ACCAFA9@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Wei Liu <wei.liu@kernel.org>	2023-01-17 13:41:27 +00:00
Richard Zhu	c435669a41	PCI: imx6: Add i.MX8MP PCIe EP support Add the i.MX8MP PCIe EP support. Link: https://lore.kernel.org/r/1673847684-31893-15-git-send-email-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-01-16 10:41:59 +01:00
Richard Zhu	fb3217e2cf	PCI: imx6: Add i.MX8MM PCIe EP support Add i.MX8MM PCIe EP support. Link: https://lore.kernel.org/r/1673847684-31893-14-git-send-email-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-01-16 10:41:59 +01:00
Richard Zhu	530ba41250	PCI: imx6: Add i.MX8MQ PCIe EP support Add i.MX8MQ PCIe EP support. Link: https://lore.kernel.org/r/1673847684-31893-13-git-send-email-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-01-16 10:41:59 +01:00
Richard Zhu	75c2f26da0	PCI: imx6: Add i.MX PCIe EP mode support i.MX PCIe is one dual mode PCIe controller. Add i.MX PCIe EP mode support here, and split the PCIe modes to the Root Complex mode and Endpoint mode. Link: https://lore.kernel.org/r/1673847684-31893-12-git-send-email-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu <hongxing.zhu@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org>	2023-01-16 10:41:59 +01:00
Linus Torvalds	9e058c2952	pci-v6.2-fixes-1 -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmPBniAUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vwOjRAAhjyRAgyiZV2rWS4pyvpQpqcpZWD9 796ZSqnzLJjVYCymGvUTX23FEA48n59+bCM/WpfEGUPrBf8LZQxC9YOCm6ltuM8+ FoSBykW/tHPq5IWaLzgrWpHeDOgEnZu/WFGGvrV3tl1mLpM1SJT8bGDsjHXlo+FM qkTEiA3nUEKQs5x9r2TTLCeUWGPNTIHNd2VfuxOqM3qC/nVCOfTTxU8nm6Lk7Eix nboAugAIADJIjs/+ZGekLBuzZYPkLYuDTyMYJ5hdo1p7wWCLc9gArEqvXKwVgmD3 ptenZeOlQi9Ay45HmkfIgfgKeeQ7REJj3dx04vf67neAianyUrB0EZDqDjR7LmgM ozlNt0XjyoeEhu6AQS0s1LZtbDiED1R/00P6Gb+YEjUCVipW2lEYYwP0v9dsnNoh 6wblgnkQoxLFM+5CAXRmCmpaoQn0Uam7okfVeohtsz8/kNQF2St0hjzr4Dmws+O3 k9PUqnnUl4ByElzpEDesVGZMJ3pxFVH15ufu8VnRqN60pLTvNrsPyU4cVnG176Rc 3RSDN3zMtPxnHJVy4r3bTNEZsX/7RUrOb4xScOXMmRDBMUc8QdscF8Oj1ucKlj5j mp7vB/7+VjU96uRarRyqUxGeQc77DCTcvOa1IGh/cuYom8ZJ6vpSCpKy6f6SFGuf i8iTTUcQKCdqVW4= =Fv2v -----END PGP SIGNATURE----- Merge tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull pci fixes from Bjorn Helgaas: - Work around apparent firmware issue that made Linux reject MMCONFIG space, which broke PCI extended config space (Bjorn Helgaas) - Fix CONFIG_PCIE_BT1 dependency due to mid-air collision between a PCI_MSI_IRQ_DOMAIN -> PCI_MSI change and addition of PCIE_BT1 (Lukas Bulwahn) * tag 'pci-v6.2-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: x86/pci: Treat EfiMemoryMappedIO as reservation of ECAM space x86/pci: Simplify is_mmconf_reserved() messages PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN	2023-01-13 17:32:22 -06:00
Rafael J. Wysocki	8133844a8f	PCI/ACPI: Account for _S0W of the target bridge in acpi_pci_bridge_d3() It is questionable to allow a PCI bridge to go into D3 if it has _S0W returning D2 or a shallower power state, so modify acpi_pci_bridge_d3(() to always take the return value of _S0W for the target bridge into account. That is, make it return 'false' if _S0W returns D2 or a shallower power state for the target bridge regardless of its ancestor Root Port properties. Of course, this also causes 'false' to be returned if the Root Port itself is the target and its _S0W returns D2 or a shallower power state. However, still allow bridges without _S0W that are power-manageable via ACPI to enter D3 to retain the current code behavior in that case. This fixes problems where a hotplug notification is missed because a bridge is in D3. That means hot-added devices such as USB4 docks (and the devices they contain) and Thunderbolt 3 devices may not work. Link: https://lore.kernel.org/linux-pci/20221031223356.32570-1-mario.limonciello@amd.com/ Link: https://lore.kernel.org/r/12155458.O9o76ZdvQC@kreacher Reported-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-01-13 15:56:10 -06:00
Linus Torvalds	bad8c4a850	xen: branch for v6.2-rc4 -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCY76ohgAKCRCAXGG7T9hj vo8fAP0XJ94B7asqcN4W3EyeyfqxUf1eZvmWRhrbKqpLnmHLaQEA/uJBkXL49Zj7 TTcbxR1coJ/hPwhtmONU4TNtCZ+RXw0= =2Ib5 -----END PGP SIGNATURE----- Merge tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - two cleanup patches - a fix of a memory leak in the Xen pvfront driver - a fix of a locking issue in the Xen hypervisor console driver * tag 'for-linus-6.2-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/pvcalls: free active map buffer on pvcalls_front_free_map hvc/xen: lock console list traversal x86/xen: Remove the unused function p2m_index() xen: make remove callback of xen driver void returned	2023-01-12 17:02:20 -06:00
Vidya Sagar	bba5065963	PCI/AER: Configure ECRC only if AER is native As the ECRC configuration bits are part of AER registers, configure ECRC only if AER is natively owned by the kernel. Link: https://lore.kernel.org/r/20230112072111.20063-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2023-01-12 12:24:37 -06:00
Ira Weiny	589c335737	PCI/CXL: Export native CXL error reporting control CXL _OSC Error Reporting Control is used by the OS to determine if Firmware has control of various CXL error reporting capabilities including the event logs. Expose the result of negotiating CXL Error Reporting Control in struct pci_host_bridge for consumption by the CXL drivers. Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: linux-pci@vger.kernel.org Cc: linux-acpi@vger.kernel.org Signed-off-by: Ira Weiny <ira.weiny@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20221212070627.1372402-2-ira.weiny@intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-01-05 13:31:27 -08:00
Lukas Bulwahn	760d560f71	PCI: dwc: Adjust to recent removal of PCI_MSI_IRQ_DOMAIN `a474d3fbe2` ("PCI/MSI: Get rid of PCI_MSI_IRQ_DOMAIN") removed PCI_MSI_IRQ_DOMAIN and changed all references to refer to PCI_MSI instead. `ba6ed462dc` ("PCI: dwc: Add Baikal-T1 PCIe controller support") independently added PCIE_BT1, depending on PCI_MSI_IRQ_DOMAIN. Both commits appeared in v6.2-rc1, so the latter missed the conversion from PCI_MSI_IRQ_DOMAIN to PCI_MSI. Update PCIE_BT1 to depend on PCI_MSI instead. [bhelgaas: commit log] Link: https://lore.kernel.org/r/20221215103452.23131-1-lukas.bulwahn@gmail.com Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Serge Semin <fancer.lancer@gmail.com>	2023-01-04 06:06:52 -06:00
Linus Torvalds	e79041113b	phy-for-6.2 - New support: - Allwinner H616 USB PHY and A100 DPHY support - TI J721s2, J784s4 and J721e support - Freescale i.MX8MP PCIe PHY support - New driver for Renesas Ethernet SERDES supporting R-Car S4-8 - Qualcomm SM8450 PCIe1 PHY support in EP mode - Updates: - again a big pile of updates on qcom-qmp-* drivers following the driver split and reorganization merged earlier - Phy order of API calls documentation update -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE+vs47OPLdNbVcHzyfBQHDyUjg0cFAmOfIbYACgkQfBQHDyUj g0fSbw//Rgfk+owGLWyJ3PxRXiDhZaJdBUQNuZEe46TjGKKHvWLJ4+ig6vrXlPgr 8mVte7jEMZubO7YE/1Vifv9xiFmjo+5R4//WlfkIwy/0SFR8+N+DPQiGU7i7ecov uzkFN26qsi4aQrKmxyadGJQzHipaLViBkr6fqfuFcmyDiFII0FoVa/mV7ZQlFtl3 cDv3leFnp3HQ9mr/mKhOSmbyWCEQHqQvjDwB50R915WfH9PLV2jYddfO4Cbwpr4r 7m7wX2EiFlQ1o2gwcFQdLiDkA8YL9Kw3wOChpbcCu4gOapJ+GWqCk0AqS9m8MMWF HnyAyHw3NxDagwV6sN19Xxa7XgkPJZPn6/92BfGYeD6H5gxmYwdROeU2/x6Qt1+z scTl1m6z8X9WWwjnWK1cqVqBPUXoJJ2smym6VBHh3f4AJAVmwZy+yyk1Oar5qa2M yDWV7nIRJQmXnuQ+XsG5rmXmmMwOuBgng4NsNX9PjhdVy6/1FUOJuMCr8ldPLAkG Lpg+GN8w6tn2G0bxrHzWeAOytxjK5XuXch99BHmXDl+NgIpp/6DuyddXmvG4nrvk R6eDv86UOQgGP2h7SujUm9f6RIWb3nJrYN27r+IHK/z5LjSMfylSSu13GvMjZkt4 Et5Q4Wk9MomHFQkhiTGTd9WlSvb497RgzKhBhMg/lJoSyTi9Eew= =4HRP -----END PGP SIGNATURE----- Merge tag 'phy-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy Pull phy updates from Vinod Koul: "This tme we have again a big pile of qcom-qmp-* changes, one new driver and bunch of new hardware support. New hardware support: - Allwinner H616 USB PHY and A100 DPHY support - TI J721s2, J784s4 and J721e support - Freescale i.MX8MP PCIe PHY support - New driver for Renesas Ethernet SERDES supporting R-Car S4-8 - Qualcomm SM8450 PCIe1 PHY support in EP mode - Qualcomm SC8280XP PCIe PHY support (including x4 mode) - Fixed Qualcomm SC8280XP USB4-USB3-DP PHY DT bindings Updates: - A big pile of updates on qcom-qmp-* drivers following the driver split and reorganization merged earlier - Phy order of API calls documentation update" * tag 'phy-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/phy/linux-phy: (174 commits) phy: ti: phy-j721e-wiz: add j721s2-wiz-10g module support dt-bindings: phy-j721e-wiz: add j721s2 compatible string phy: use devm_platform_get_and_ioremap_resource() phy: allwinner: phy-sun6i-mipi-dphy: Add the A100 DPHY variant phy: allwinner: phy-sun6i-mipi-dphy: Add a variant power-on hook phy: allwinner: phy-sun6i-mipi-dphy: Set the enable bit last phy: allwinner: phy-sun6i-mipi-dphy: Make RX support optional dt-bindings: sun6i-a31-mipi-dphy: Add the A100 DPHY variant dt-bindings: sun6i-a31-mipi-dphy: Add the interrupts property phy: qcom-qmp-pcie: drop redundant clock allocation phy: qcom-qmp-usb: drop redundant clock allocation phy: qcom-qmp: drop unused type header phy: qcom-qmp-usb: drop sc8280xp reference-clock source dt-bindings: phy: qcom,sc8280xp-qmp-usb3-uni: drop reference-clock source phy: qcom-qmp-combo: add support for updated sc8280xp binding phy: qcom-qmp-combo: rename DP_PHY register pointer phy: qcom-qmp-combo: rename common-register pointers phy: qcom-qmp-combo: clean up DP clock callbacks phy: qcom-qmp-combo: separate clock and provider registration phy: qcom-qmp-combo: add clock registration helper ...	2022-12-19 08:40:58 -06:00
Dawei Li	7cffcade57	xen: make remove callback of xen driver void returned Since commit `fc7a6209d5` ("bus: Make remove callback return void") forces bus_type::remove be void-returned, it doesn't make much sense for any bus based driver implementing remove callbalk to return non-void to its caller. This change is for xen bus based drivers. Acked-by: Juergen Gross <jgross@suse.com> Signed-off-by: Dawei Li <set_pte_at@outlook.com> Link: https://lore.kernel.org/r/TYCP286MB23238119AB4DF190997075C9CAE39@TYCP286MB2323.JPNP286.PROD.OUTLOOK.COM Signed-off-by: Juergen Gross <jgross@suse.com>	2022-12-15 16:06:10 +01:00
Linus Torvalds	c7020e1b34	pci-v6.2-changes -----BEGIN PGP SIGNATURE----- iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmOYpTIUHGJoZWxnYWFz QGdvb2dsZS5jb20ACgkQWYigwDrT+vxuZhAAhGjE8voLZeOYwxbvfL69hGTAZ+Me x2hqRWVhh/IGWXTTaoSLwSjMMokcmAKN5S/wv8qdCG5sB8EN8FyTBIZDy8PuRRdl 8UlqlBMSL+d4oSRDCnYLxFNcynLRNnmx2dfcdw9tJ4zjTLN8Y4o8PHFogR6pJ3MT sDC8S0myTQKXr4wAGzTZycKsiGManviYtByp6dCcKD3Oy5Q2uZ9OKO2DP2yQpn+F c3IJSV9oDz3KR8JVJ5Q1iz9cdMXbGwjkM3JLlHpxhedwjN4ErLumPutKcebtzO5C aTqabN7Nnzc4yJusAIfojFCWH7fgaYUyJ3pxcFyJ4tu4m9Last+2I5UB/kV2sYAD jWiCYx3sA/mRopNXOnrBGae+Lgy+sQnt8or0grySr0bK+b+ArAGis4uT4A0uASGO RUQdIQwz7zhHeQrwAladHWxnx4BEDNCatgfn38p4fklIYKydCY5nfZURMDvHezSR G6Nu08hoE9ZXlmkWTFw+5F23wPWKcCpzZj0hf7OroIouXUp8vqSFSqatH5vGkbCl bDswck9GdRJ2hl5SvFOeelaXkM42du45TMLU2JmIn6dYYFNrO93JgdvKSU7E2CpG AmDIpg1Idxo8fEPPGH1I7RVU5+ilzmmPQQY7poQW+va4/dEd/QVp1+ZZTDnMC1qk qi3ck22VdvPU2VU= =KULr -----END PGP SIGNATURE----- Merge tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci Pull PCI updates from Bjorn Helgaas: "Enumeration: - Squash portdrv_{core,pci}.c into portdrv.c to ease maintenance and make more things static. - Make portdrv bind to Switch Ports that have AER. Previously, if these Ports lacked MSI/MSI-X, portdrv failed to bind, which meant the Ports couldn't be suspended to low-power states. AER on these Ports doesn't use interrupts, and the AER driver doesn't need to claim them. - Assign PCI domain IDs using ida_alloc(), which makes host bridge add/remove work better. Resource management: - To work better with recent BIOSes that use EfiMemoryMappedIO for PCI host bridge apertures, remove those regions from the E820 map (E820 entries normally prevent us from allocating BARs). In v5.19, we added some quirks to disable E820 checking, but that's not very maintainable. EfiMemoryMappedIO means the OS needs to map the region for use by EFI runtime services; it shouldn't prevent OS from using it. PCIe native device hotplug: - Build pciehp by default if USB4 is enabled, since Thunderbolt/USB4 PCIe tunneling depends on native PCIe hotplug. - Enable Command Completed Interrupt only if supported to avoid user confusion from lspci output that says this is enabled but not supported. - Prevent pciehp from binding to Switch Upstream Ports; this happened because of interaction with acpiphp and caused devices below the Upstream Port to disappear. Power management: - Convert AGP drivers to generic power management. We hope to remove legacy power management from the PCI core eventually. Virtualization: - Fix pci_device_is_present(), which previously always returned "false" for VFs, causing virtio hangs when unbinding the driver. Miscellaneous: - Convert drivers to gpiod API to prepare for dropping some legacy code. - Fix DOE fencepost error for the maximum data object length. Baikal-T1 PCIe controller driver: - Add driver and DT bindings. Broadcom STB PCIe controller driver: - Enable Multi-MSI. - Delay 100ms after PERST# deassert to allow power and clocks to stabilize. - Configure Read Completion Boundary to 64 bytes. Freescale i.MX6 PCIe controller driver: - Initialize PHY before deasserting core reset to fix a regression in v6.0 on boards where the PHY provides the reference. - Fix imx6sx and imx8mq clock names in DT schema. Intel VMD host bridge driver: - Fix Secondary Bus Reset on VMD bridges, which allows reset of NVMe SSDs in VT-d pass-through scenarios. - Disable MSI remapping, which gets re-enabled by firmware during suspend/resume. MediaTek PCIe Gen3 controller driver: - Add MT7986 and MT8195 support. Qualcomm PCIe controller driver: - Add SC8280XP/SA8540P basic interconnect support. Rockchip DesignWare PCIe controller driver: - Base DT schema on common Synopsys schema. Synopsys DesignWare PCIe core: - Collect DT items shared between Root Port and Endpoint (PERST GPIO, PHY info, clocks, resets, link speed, number of lanes, number of iATU windows, interrupt info, etc) to snps,dw-pcie-common.yaml. - Add dma-ranges support for Root Ports and Endpoints. - Consolidate DT resource retrieval for "dbi", "dbi2", "atu", etc. to reduce code duplication. - Add generic names for clocks and resets to encourage more consistent naming across drivers using DesignWare IP. - Stop advertising PTM Responder role for Endpoints, which aren't allowed to be responders. TI J721E PCIe driver: - Add j721s2 host mode ID to DT schema. - Add interrupt properties to DT schema. Toshiba Visconti PCIe controller driver: - Fix interrupts array max constraints in DT schema" * tag 'pci-v6.2-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci: (95 commits) x86/PCI: Use pr_info() when possible x86/PCI: Fix log message typo x86/PCI: Tidy E820 removal messages PCI: Skip allocate_resource() if too little space available efi/x86: Remove EfiMemoryMappedIO from E820 map PCI/portdrv: Allow AER service only for Root Ports & RCECs PCI: xilinx-nwl: Fix coding style violations PCI: mvebu: Switch to using gpiod API PCI: pciehp: Enable Command Completed Interrupt only if supported PCI: aardvark: Switch to using devm_gpiod_get_optional() dt-bindings: PCI: mediatek-gen3: add support for mt7986 dt-bindings: PCI: mediatek-gen3: add SoC based clock config dt-bindings: PCI: qcom: Allow 'dma-coherent' property PCI: mt7621: Add sentinel to quirks table PCI: vmd: Fix secondary bus reset for Intel bridges PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32) PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path ...	2022-12-14 09:54:10 -08:00
Linus Torvalds	08cdc21579	iommufd for 6.2 iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCY5ct7wAKCRCFwuHvBreF YZZ5AQDciXfcgXLt0UBEmWupNb0f/asT6tk717pdsKm8kAZMNAEAsIyLiKT5HqGl s7fAu+CQ1pr9+9NKGevD+frw8Solsw4= =jJkd -----END PGP SIGNATURE----- Merge tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd implementation from Jason Gunthorpe: "iommufd is the user API to control the IOMMU subsystem as it relates to managing IO page tables that point at user space memory. It takes over from drivers/vfio/vfio_iommu_type1.c (aka the VFIO container) which is the VFIO specific interface for a similar idea. We see a broad need for extended features, some being highly IOMMU device specific: - Binding iommu_domain's to PASID/SSID - Userspace IO page tables, for ARM, x86 and S390 - Kernel bypassed invalidation of user page tables - Re-use of the KVM page table in the IOMMU - Dirty page tracking in the IOMMU - Runtime Increase/Decrease of IOPTE size - PRI support with faults resolved in userspace Many of these HW features exist to support VM use cases - for instance the combination of PASID, PRI and Userspace IO Page Tables allows an implementation of DMA Shared Virtual Addressing (vSVA) within a guest. Dirty tracking enables VM live migration with SRIOV devices and PASID support allow creating "scalable IOV" devices, among other things. As these features are fundamental to a VM platform they need to be uniformly exposed to all the driver families that do DMA into VMs, which is currently VFIO and VDPA" For more background, see the extended explanations in Jason's pull request: https://lore.kernel.org/lkml/Y5dzTU8dlmXTbzoJ@nvidia.com/ * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: (62 commits) iommufd: Change the order of MSI setup iommufd: Improve a few unclear bits of code iommufd: Fix comment typos vfio: Move vfio group specific code into group.c vfio: Refactor dma APIs for emulated devices vfio: Wrap vfio group module init/clean code into helpers vfio: Refactor vfio_device open and close vfio: Make vfio_device_open() truly device specific vfio: Swap order of vfio_device_container_register() and open_device() vfio: Set device->group in helper function vfio: Create wrappers for group register/unregister vfio: Move the sanity check of the group to vfio_create_group() vfio: Simplify vfio_create_group() iommufd: Allow iommufd to supply /dev/vfio/vfio vfio: Make vfio_container optionally compiled vfio: Move container related MODULE_ALIAS statements into container.c vfio-iommufd: Support iommufd for emulated VFIO devices vfio-iommufd: Support iommufd for physical VFIO devices vfio-iommufd: Allow iommufd to be used in place of a container fd vfio: Use IOMMU_CAP_ENFORCE_CACHE_COHERENCY for vfio_file_enforced_coherent() ...	2022-12-14 09:15:43 -08:00
Linus Torvalds	ce8a79d560	for-6.2/block-2022-12-08 -----BEGIN PGP SIGNATURE----- iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9 2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI 0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7 Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr wxzFGfvHfw== =k/vv -----END PGP SIGNATURE----- Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux Pull block updates from Jens Axboe: - NVMe pull requests via Christoph: - Support some passthrough commands without CAP_SYS_ADMIN (Kanchan Joshi) - Refactor PCIe probing and reset (Christoph Hellwig) - Various fabrics authentication fixes and improvements (Sagi Grimberg) - Avoid fallback to sequential scan due to transient issues (Uday Shankar) - Implement support for the DEAC bit in Write Zeroes (Christoph Hellwig) - Allow overriding the IEEE OUI and firmware revision in configfs for nvmet (Aleksandr Miloserdov) - Force reconnect when number of queue changes in nvmet (Daniel Wagner) - Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi Grimberg, Christoph Hellwig, Christophe JAILLET) - Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni) - Use the common tagset helpers in nvme-pci driver (Christoph Hellwig) - Cleanup the nvme-pci removal path (Christoph Hellwig) - Use kstrtobool() instead of strtobool (Christophe JAILLET) - Allow unprivileged passthrough of Identify Controller (Joel Granados) - Support io stats on the mpath device (Sagi Grimberg) - Minor nvmet cleanup (Sagi Grimberg) - MD pull requests via Song: - Code cleanups (Christoph) - Various fixes - Floppy pull request from Denis: - Fix a memory leak in the init error path (Yuan) - Series fixing some batch wakeup issues with sbitmap (Gabriel) - Removal of the pktcdvd driver that was deprecated more than 5 years ago, and subsequent removal of the devnode callback in struct block_device_operations as no users are now left (Greg) - Fix for partition read on an exclusively opened bdev (Jan) - Series of elevator API cleanups (Jinlong, Christoph) - Series of fixes and cleanups for blk-iocost (Kemeng) - Series of fixes and cleanups for blk-throttle (Kemeng) - Series adding concurrent support for sync queues in BFQ (Yu) - Series bringing drbd a bit closer to the out-of-tree maintained version (Christian, Joel, Lars, Philipp) - Misc drbd fixes (Wang) - blk-wbt fixes and tweaks for enable/disable (Yu) - Fixes for mq-deadline for zoned devices (Damien) - Add support for read-only and offline zones for null_blk (Shin'ichiro) - Series fixing the delayed holder tracking, as used by DM (Yu, Christoph) - Series enabling bio alloc caching for IRQ based IO (Pavel) - Series enabling userspace peer-to-peer DMA (Logan) - BFQ waker fixes (Khazhismel) - Series fixing elevator refcount issues (Christoph, Jinlong) - Series cleaning up references around queue destruction (Christoph) - Series doing quiesce by tagset, enabling cleanups in drivers (Christoph, Chao) - Series untangling the queue kobject and queue references (Christoph) - Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye, Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph) * tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits) blktrace: Fix output non-blktrace event when blk_classic option enabled block: sed-opal: Don't include <linux/kernel.h> sed-opal: allow using IOC_OPAL_SAVE for locking too blk-cgroup: Fix typo in comment block: remove bio_set_op_attrs nvmet: don't open-code NVME_NS_ATTR_RO enumeration nvme-pci: use the tagset alloc/free helpers nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers nvme: consolidate setting the tagset flags nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set block: bio_copy_data_iter nvme-pci: split out a nvme_pci_ctrl_is_dead helper nvme-pci: return early on ctrl state mismatch in nvme_reset_work nvme-pci: rename nvme_disable_io_queues nvme-pci: cleanup nvme_suspend_queue nvme-pci: remove nvme_pci_disable nvme-pci: remove nvme_disable_admin_queue nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl nvme: use nvme_wait_ready in nvme_shutdown_ctrl ...	2022-12-13 10:43:59 -08:00
Linus Torvalds	268325bda5	Random number generator updates for Linux 6.2-rc1. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6 /ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4 CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE= =QRhK -----END PGP SIGNATURE----- Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random Pull random number generator updates from Jason Donenfeld: - Replace prandom_u32_max() and various open-coded variants of it, there is now a new family of functions that uses fast rejection sampling to choose properly uniformly random numbers within an interval: get_random_u32_below(ceil) - [0, ceil) get_random_u32_above(floor) - (floor, U32_MAX] get_random_u32_inclusive(floor, ceil) - [floor, ceil] Coccinelle was used to convert all current users of prandom_u32_max(), as well as many open-coded patterns, resulting in improvements throughout the tree. I'll have a "late" 6.1-rc1 pull for you that removes the now unused prandom_u32_max() function, just in case any other trees add a new use case of it that needs to converted. According to linux-next, there may be two trivial cases of prandom_u32_max() reintroductions that are fixable with a 's/.../.../'. So I'll have for you a final conversion patch doing that alongside the removal patch during the second week. This is a treewide change that touches many files throughout. - More consistent use of get_random_canary(). - Updates to comments, documentation, tests, headers, and simplification in configuration. - The arch_get_random_early() abstraction was only used by arm64 and wasn't entirely useful, so this has been replaced by code that works in all relevant contexts. - The kernel will use and manage random seeds in non-volatile EFI variables, refreshing a variable with a fresh seed when the RNG is initialized. The RNG GUID namespace is then hidden from efivarfs to prevent accidental leakage. These changes are split into random.c infrastructure code used in the EFI subsystem, in this pull request, and related support inside of EFISTUB, in Ard's EFI tree. These are co-dependent for full functionality, but the order of merging doesn't matter. - Part of the infrastructure added for the EFI support is also used for an improvement to the way vsprintf initializes its siphash key, replacing an sleep loop wart. - The hardware RNG framework now always calls its correct random.c input function, add_hwgenerator_randomness(), rather than sometimes going through helpers better suited for other cases. - The add_latent_entropy() function has long been called from the fork handler, but is a no-op when the latent entropy gcc plugin isn't used, which is fine for the purposes of latent entropy. But it was missing out on the cycle counter that was also being mixed in beside the latent entropy variable. So now, if the latent entropy gcc plugin isn't enabled, add_latent_entropy() will expand to a call to add_device_randomness(NULL, 0), which adds a cycle counter, without the absent latent entropy variable. - The RNG is now reseeded from a delayed worker, rather than on demand when used. Always running from a worker allows it to make use of the CPU RNG on platforms like S390x, whose instructions are too slow to do so from interrupts. It also has the effect of adding in new inputs more frequently with more regularity, amounting to a long term transcript of random values. Plus, it helps a bit with the upcoming vDSO implementation (which isn't yet ready for 6.2). - The jitter entropy algorithm now tries to execute on many different CPUs, round-robining, in hopes of hitting even more memory latencies and other unpredictable effects. It also will mix in a cycle counter when the entropy timer fires, in addition to being mixed in from the main loop, to account more explicitly for fluctuations in that timer firing. And the state it touches is now kept within the same cache line, so that it's assured that the different execution contexts will cause latencies. tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits) random: include <linux/once.h> in the right header random: align entropy_timer_state to cache line random: mix in cycle counter when jitter timer fires random: spread out jitter callback to different CPUs random: remove extraneous period and add a missing one in comments efi: random: refresh non-volatile random seed when RNG is initialized vsprintf: initialize siphash key using notifier random: add back async readiness notifier random: reseed in delayed work rather than on-demand random: always mix cycle counter in add_latent_entropy() hw_random: use add_hwgenerator_randomness() for early entropy random: modernize documentation comment on get_random_bytes() random: adjust comment to account for removed function random: remove early archrandom abstraction random: use random.trust_{bootloader,cpu} command line option only stackprotector: actually use get_random_canary() stackprotector: move get_random_canary() into stackprotector.h treewide: use get_random_u32_inclusive() when possible treewide: use get_random_u32_{above,below}() instead of manual loop treewide: use get_random_u32_below() instead of deprecated function ...	2022-12-12 16:22:22 -08:00
Linus Torvalds	c1f0fcd85d	cxl for 6.2 - Add the cpu_cache_invalidate_memregion() API for cache flushing in response to physical memory reconfiguration, or memory-side data invalidation from operations like secure erase or memory-device unlock. - Add a facility for the kernel to warn about collisions between kernel and userspace access to PCI configuration registers - Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1) - Add handling and reporting of CXL errors reported via the PCIe AER mechanism - Add support for CXL Persistent Memory Security commands - Add support for the "XOR" algorithm for CXL host bridge interleave - Rework / simplify CXL to NVDIMM interactions - Miscellaneous cleanups and fixes -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCY5UpyAAKCRDfioYZHlFs Z0ttAP4uxCjIibKsFVyexpSgI4vaZqQ9yt9NesmPwonc0XookwD+PlwP6Xc0d0Ox t0gJ6+pwdh11NRzhcNE1pAaPcJZU4gs= =HAQk -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull cxl updates from Dan Williams: "Compute Express Link (CXL) updates for 6.2. While it may seem backwards, the CXL update this time around includes some focus on CXL 1.x enabling where the work to date had been with CXL 2.0 (VH topologies) in mind. First generation CXL can mostly be supported via BIOS, similar to DDR, however it became clear there are use cases for OS native CXL error handling and some CXL 3.0 endpoint features can be deployed on CXL 1.x hosts (Restricted CXL Host (RCH) topologies). So, this update brings RCH topologies into the Linux CXL device model. In support of the ongoing CXL 2.0+ enabling two new core kernel facilities are added. One is the ability for the kernel to flag collisions between userspace access to PCI configuration registers and kernel accesses. This is brought on by the PCIe Data-Object-Exchange (DOE) facility, a hardware mailbox over config-cycles. The other is a cpu_cache_invalidate_memregion() API that maps to wbinvd_on_all_cpus() on x86. To prevent abuse it is disabled in guest VMs and architectures that do not support it yet. The CXL paths that need it, dynamic memory region creation and security commands (erase / unlock), are disabled when it is not present. As for the CXL 2.0+ this cycle the subsystem gains support Persistent Memory Security commands, error handling in response to PCIe AER notifications, and support for the "XOR" host bridge interleave algorithm. Summary: - Add the cpu_cache_invalidate_memregion() API for cache flushing in response to physical memory reconfiguration, or memory-side data invalidation from operations like secure erase or memory-device unlock. - Add a facility for the kernel to warn about collisions between kernel and userspace access to PCI configuration registers - Add support for Restricted CXL Host (RCH) topologies (formerly CXL 1.1) - Add handling and reporting of CXL errors reported via the PCIe AER mechanism - Add support for CXL Persistent Memory Security commands - Add support for the "XOR" algorithm for CXL host bridge interleave - Rework / simplify CXL to NVDIMM interactions - Miscellaneous cleanups and fixes" * tag 'cxl-for-6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (71 commits) cxl/region: Fix memdev reuse check cxl/pci: Remove endian confusion cxl/pci: Add some type-safety to the AER trace points cxl/security: Drop security command ioctl uapi cxl/mbox: Add variable output size validation for internal commands cxl/mbox: Enable cxl_mbox_send_cmd() users to validate output size cxl/security: Fix Get Security State output payload endian handling cxl: update names for interleave ways conversion macros cxl: update names for interleave granularity conversion macros cxl/acpi: Warn about an invalid CHBCR in an existing CHBS entry tools/testing/cxl: Require cache invalidation bypass cxl/acpi: Fail decoder add if CXIMS for HBIG is missing cxl/region: Fix spelling mistake "memergion" -> "memregion" cxl/regs: Fix sparse warning cxl/acpi: Set ACPI's CXL _OSC to indicate RCD mode support tools/testing/cxl: Add an RCH topology cxl/port: Add RCD endpoint port enumeration cxl/mem: Move devm_cxl_add_endpoint() from cxl_core to cxl_mem tools/testing/cxl: Add XOR Math support to cxl_test cxl/acpi: Support CXL XOR Interleave Math (CXIMS) ...	2022-12-12 13:55:31 -08:00
Linus Torvalds	9d33edb20f	Updates for the interrupt core and driver subsystem: - Core: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages contrary to the uniform and specification defined storage mechanisms for PCI/MSI and PCI/MSI-X. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stranglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: \|--- device 1 [Vector]---[Remapping]---[PCI/MSI]--\|... \|--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: \|--- [PCI/MSI] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: \|--- [PCI/MSI] device 1 \|--- [PCI/IMS] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N \|--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. - Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place -----BEGIN PGP SIGNATURE----- iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUsygTHHRnbHhAbGlu dXRyb25peC5kZQAKCRCmGPVMDXSYoYXiD/40tXKzCzf0qFIqUlZLia1N3RRrwrNC DVTixuLtR9MrjwE+jWLQILa85SHInV8syXHSd35SzhsGDxkURFGi+HBgVWmysODf br9VSh3Gi+kt7iXtIwAg8WNWviGNmS3kPksxCko54F0YnJhMY5r5bhQVUBQkwFG2 wES1C9Uzd4pdV2bl24Z+WKL85cSmZ+pHunyKw1n401lBABXnTF9c4f13zC14jd+y wDxNrmOxeL3mEH4Pg6VyrDuTOURSf3TjJjeEq3EYqvUo0FyLt9I/cKX0AELcZQX7 fkRjrQQAvXNj39RJfeSkojDfllEPUHp7XSluhdBu5aIovSamdYGCDnuEoZ+l4MJ+ CojIErp3Dwj/uSaf5c7C3OaDAqH2CpOFWIcrUebShJE60hVKLEpUwd6W8juplaoT gxyXRb1Y+BeJvO8VhMN4i7f3232+sj8wuj+HTRTTbqMhkElnin94tAx8rgwR1sgR BiOGMJi4K2Y8s9Rqqp0Dvs01CW4guIYvSR4YY+WDbbi1xgiev89OYs6zZTJCJe4Y NUwwpqYSyP1brmtdDdBOZLqegjQm+TwUb6oOaasFem4vT1swgawgLcDnPOx45bk5 /FWt3EmnZxMz99x9jdDn1+BCqAZsKyEbEY1avvhPVMTwoVIuSX2ceTBMLseGq+jM 03JfvdxnueM3gw== =9erA -----END PGP SIGNATURE----- Merge tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq updates from Thomas Gleixner: "Updates for the interrupt core and driver subsystem: The bulk is the rework of the MSI subsystem to support per device MSI interrupt domains. This solves conceptual problems of the current PCI/MSI design which are in the way of providing support for PCI/MSI[-X] and the upcoming PCI/IMS mechanism on the same device. IMS (Interrupt Message Store] is a new specification which allows device manufactures to provide implementation defined storage for MSI messages (as opposed to PCI/MSI and PCI/MSI-X that has a specified message store which is uniform accross all devices). The PCI/MSI[-X] uniformity allowed us to get away with "global" PCI/MSI domains. IMS not only allows to overcome the size limitations of the MSI-X table, but also gives the device manufacturer the freedom to store the message in arbitrary places, even in host memory which is shared with the device. There have been several attempts to glue this into the current MSI code, but after lengthy discussions it turned out that there is a fundamental design problem in the current PCI/MSI-X implementation. This needs some historical background. When PCI/MSI[-X] support was added around 2003, interrupt management was completely different from what we have today in the actively developed architectures. Interrupt management was completely architecture specific and while there were attempts to create common infrastructure the commonalities were rudimentary and just providing shared data structures and interfaces so that drivers could be written in an architecture agnostic way. The initial PCI/MSI[-X] support obviously plugged into this model which resulted in some basic shared infrastructure in the PCI core code for setting up MSI descriptors, which are a pure software construct for holding data relevant for a particular MSI interrupt, but the actual association to Linux interrupts was completely architecture specific. This model is still supported today to keep museum architectures and notorious stragglers alive. In 2013 Intel tried to add support for hot-pluggable IO/APICs to the kernel, which was creating yet another architecture specific mechanism and resulted in an unholy mess on top of the existing horrors of x86 interrupt handling. The x86 interrupt management code was already an incomprehensible maze of indirections between the CPU vector management, interrupt remapping and the actual IO/APIC and PCI/MSI[-X] implementation. At roughly the same time ARM struggled with the ever growing SoC specific extensions which were glued on top of the architected GIC interrupt controller. This resulted in a fundamental redesign of interrupt management and provided the today prevailing concept of hierarchical interrupt domains. This allowed to disentangle the interactions between x86 vector domain and interrupt remapping and also allowed ARM to handle the zoo of SoC specific interrupt components in a sane way. The concept of hierarchical interrupt domains aims to encapsulate the functionality of particular IP blocks which are involved in interrupt delivery so that they become extensible and pluggable. The X86 encapsulation looks like this: \|--- device 1 [Vector]---[Remapping]---[PCI/MSI]--\|... \|--- device N where the remapping domain is an optional component and in case that it is not available the PCI/MSI[-X] domains have the vector domain as their parent. This reduced the required interaction between the domains pretty much to the initialization phase where it is obviously required to establish the proper parent relation ship in the components of the hierarchy. While in most cases the model is strictly representing the chain of IP blocks and abstracting them so they can be plugged together to form a hierarchy, the design stopped short on PCI/MSI[-X]. Looking at the hardware it's clear that the actual PCI/MSI[-X] interrupt controller is not a global entity, but strict a per PCI device entity. Here we took a short cut on the hierarchical model and went for the easy solution of providing "global" PCI/MSI domains which was possible because the PCI/MSI[-X] handling is uniform across the devices. This also allowed to keep the existing PCI/MSI[-X] infrastructure mostly unchanged which in turn made it simple to keep the existing architecture specific management alive. A similar problem was created in the ARM world with support for IP block specific message storage. Instead of going all the way to stack a IP block specific domain on top of the generic MSI domain this ended in a construct which provides a "global" platform MSI domain which allows overriding the irq_write_msi_msg() callback per allocation. In course of the lengthy discussions we identified other abuse of the MSI infrastructure in wireless drivers, NTB etc. where support for implementation specific message storage was just mindlessly glued into the existing infrastructure. Some of this just works by chance on particular platforms but will fail in hard to diagnose ways when the driver is used on platforms where the underlying MSI interrupt management code does not expect the creative abuse. Another shortcoming of today's PCI/MSI-X support is the inability to allocate or free individual vectors after the initial enablement of MSI-X. This results in an works by chance implementation of VFIO (PCI pass-through) where interrupts on the host side are not set up upfront to avoid resource exhaustion. They are expanded at run-time when the guest actually tries to use them. The way how this is implemented is that the host disables MSI-X and then re-enables it with a larger number of vectors again. That works by chance because most device drivers set up all interrupts before the device actually will utilize them. But that's not universally true because some drivers allocate a large enough number of vectors but do not utilize them until it's actually required, e.g. for acceleration support. But at that point other interrupts of the device might be in active use and the MSI-X disable/enable dance can just result in losing interrupts and therefore hard to diagnose subtle problems. Last but not least the "global" PCI/MSI-X domain approach prevents to utilize PCI/MSI[-X] and PCI/IMS on the same device due to the fact that IMS is not longer providing a uniform storage and configuration model. The solution to this is to implement the missing step and switch from global PCI/MSI domains to per device PCI/MSI domains. The resulting hierarchy then looks like this: \|--- [PCI/MSI] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N which in turn allows to provide support for multiple domains per device: \|--- [PCI/MSI] device 1 \|--- [PCI/IMS] device 1 [Vector]---[Remapping]---\|... \|--- [PCI/MSI] device N \|--- [PCI/IMS] device N This work converts the MSI and PCI/MSI core and the x86 interrupt domains to the new model, provides new interfaces for post-enable allocation/free of MSI-X interrupts and the base framework for PCI/IMS. PCI/IMS has been verified with the work in progress IDXD driver. There is work in progress to convert ARM over which will replace the platform MSI train-wreck. The cleanup of VFIO, NTB and other creative "solutions" are in the works as well. Drivers: - Updates for the LoongArch interrupt chip drivers - Support for MTK CIRQv2 - The usual small fixes and updates all over the place" * tag 'irq-core-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (134 commits) irqchip/ti-sci-inta: Fix kernel doc irqchip/gic-v2m: Mark a few functions __init irqchip/gic-v2m: Include arm-gic-common.h irqchip/irq-mvebu-icu: Fix works by chance pointer assignment iommu/amd: Enable PCI/IMS iommu/vt-d: Enable PCI/IMS x86/apic/msi: Enable PCI/IMS PCI/MSI: Provide pci_ims_alloc/free_irq() PCI/MSI: Provide IMS (Interrupt Message Store) support genirq/msi: Provide constants for PCI/IMS support x86/apic/msi: Enable MSI_FLAG_PCI_MSIX_ALLOC_DYN PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X PCI/MSI: Provide prepare_desc() MSI domain op PCI/MSI: Split MSI-X descriptor setup genirq/msi: Provide MSI_FLAG_MSIX_ALLOC_DYN genirq/msi: Provide msi_domain_alloc_irq_at() genirq/msi: Provide msi_domain_ops:: Prepare_desc() genirq/msi: Provide msi_desc:: Msi_data genirq/msi: Provide struct msi_map x86/apic/msi: Remove arch_create_remap_msi_irq_domain() ...	2022-12-12 11:21:29 -08:00
Bjorn Helgaas	f826afe5ea	Merge branch 'pci/kbuild' - Remove unnecessary <linux/of_irq.h> includes (Bjorn Helgaas) * pci/kbuild: PCI: Drop of_match_ptr() to avoid unused variables PCI: Remove unnecessary <linux/of_irq.h> includes PCI: xgene-msi: Include <linux/irqdomain.h> explicitly PCI: mvebu: Include <linux/irqdomain.h> explicitly PCI: microchip: Include <linux/irqdomain.h> explicitly PCI: altera-msi: Include <linux/irqdomain.h> explicitly # Conflicts: # drivers/pci/controller/pci-mvebu.c	2022-12-10 10:36:52 -06:00
Bjorn Helgaas	e4d741e9e4	Merge branch 'pci/ctrl/xilinx' - Fix whitespace issues (Michal Simek) * pci/ctrl/xilinx: PCI: xilinx-nwl: Fix coding style violations	2022-12-10 10:36:42 -06:00
Bjorn Helgaas	4e5194733a	Merge branch 'pci/ctrl/mvebu' - Switch to the gpiod API so we can make of_get_named_gpio_flags() private (Dmitry Torokhov) * pci/ctrl/mvebu: PCI: mvebu: Switch to using gpiod API	2022-12-10 10:36:41 -06:00
Bjorn Helgaas	0454c6c0ed	Merge branch 'pci/ctrl/aardvark' - Switch to using devm_gpiod_get_optional() so we can stop exporting devm_gpiod_get_from_of_node() (Dmitry Torokhov) * pci/ctrl/aardvark: PCI: aardvark: Switch to using devm_gpiod_get_optional()	2022-12-10 10:36:40 -06:00
Bjorn Helgaas	bcccaa0a48	Merge branch 'remotes/lorenzo/pci/misc' - Register notifier if core_init_notifier is enabled in pci-epf-test (Kunihiko Hayashi) - Fixup Kconfig indentation (Shunsuke Mie) * remotes/lorenzo/pci/misc: PCI: endpoint: Fix Kconfig indent style PCI: pci-epf-test: Register notifier if only core_init_notifier is enabled	2022-12-10 10:36:40 -06:00
Bjorn Helgaas	ba7deaa2a8	Merge branch 'remotes/lorenzo/pci/vmd' - Restore MSI remapping configuration during resume because the configuration is cleared out by firmware when suspending (Nirmal Patel) - Reset the hierarchy below VMD when probing the VMD; we attempted this before, but with the wrong device, so it didn't work (Francisco Munoz) * remotes/lorenzo/pci/vmd: PCI: vmd: Fix secondary bus reset for Intel bridges PCI: vmd: Disable MSI remapping after suspend	2022-12-10 10:36:39 -06:00
Bjorn Helgaas	4e5db7983d	Merge branch 'remotes/lorenzo/pci/tegra' - Switch from devm_gpiod_get_from_of_node() to devm_fwnode_gpiod_get() (Dmitry Torokhov) * remotes/lorenzo/pci/tegra: PCI: tegra: Switch to using devm_fwnode_gpiod_get	2022-12-10 10:36:39 -06:00
Bjorn Helgaas	008ee711f9	Merge branch 'remotes/lorenzo/pci/qcom' - Add DT and driver support for SC8280XP/SA8540P basic interconnects where interconnect bandwidth must be requested before enabling interconnect clocks (Johan Hovold) - Add 'dma-coherent' property (Johan Hovold) * remotes/lorenzo/pci/qcom: dt-bindings: PCI: qcom: Allow 'dma-coherent' property PCI: qcom: Add basic interconnect support dt-bindings: PCI: qcom: Add SC8280XP/SA8540P interconnects	2022-12-10 10:36:38 -06:00
Bjorn Helgaas	8ecdba32a5	Merge branch 'remotes/lorenzo/pci/mt7621' - Add sentinel to mt7621_pcie_quirks_match[] to prevent oops when parsing the table (John Thomson) * remotes/lorenzo/pci/mt7621: PCI: mt7621: Add sentinel to quirks table	2022-12-10 10:36:38 -06:00
Bjorn Helgaas	c00a109054	Merge branch 'remotes/lorenzo/pci/endpoint' - Add a .release() callback for the Endpoint Controller library so an Endpoint driver is removable (Yoshihiro Shimoda) - Fix pci-epf-vntb kernel-doc and whitespace (Frank Li) - Fix pci-epf-vntb error path usage of pci_epc_mem_free_addr() (Frank Li) - Remove pci-epf-vntb unused epf_db_phy (Frank Li) - Fix pci-epf-vntb sparse warnings (Frank Li) * remotes/lorenzo/pci/endpoint: PCI: endpoint: pci-epf-vntb: Fix sparse ntb->reg build warning PCI: endpoint: pci-epf-vntb: Fix sparse build warning for epf_db PCI: endpoint: pci-epf-vntb: Replace hardcoded 4 with sizeof(u32) PCI: endpoint: pci-epf-vntb: Remove unused epf_db_phy struct member PCI: endpoint: pci-epf-vntb: Fix call pci_epc_mem_free_addr() in error path PCI: endpoint: pci-epf-vntb: Fix struct epf_ntb_ctrl indentation PCI: endpoint: pci-epf-vntb: Clean up kernel_doc warning PCI: endpoint: Fix WARN() when an endpoint driver is removed	2022-12-10 10:36:37 -06:00
Bjorn Helgaas	29a3e5aedc	Merge branch 'remotes/lorenzo/pci/dwc' - Fix n_fts[] array overrun (Vidya Sagar) - Don't advertise PTM Responder role for Endpoints (Vidya Sagar) - Fix qcom "reset assert" error message (Manivannan Sadhasivam) - Downgrade "link didn't come up" message to dev_info (Vidya Sagar) - Initialize PHY before deasserting core reset so the link comes up on boards where the PHY provides the reference clock (this was a regression in v6.0) (Sascha Hauer) - Switch histb to the gpiod API (Dmitry Torokhov) - Fix imx6sx and imx8mq clock names in DT binding (Serge Semin) - Fix visconti MSI interrupt in DT binding (Serge Semin) - Consolidate reset-gpio, cdm, windows info in common DT shared by both Root Port and Endpoint bindings (Serge Semin) - Remove bus node from DT examples (Serge Semin) - Add common phys, phy-names to DT (Serge Semin) - Add default max-link-speed of Gen5 to DT (Serge Semin) - Apply generic schema for generic device (Serge Semin) - Add default max-functions of 32 to DT (Serge Semin) - Add common interrupts, interrupt-names to DT (Serge Semin) - Add common regs, reg-names to DT (Serge Semin) - Add common clocks, resets to DT (Serge Semin) - Add dma-coherent to DT (Serge Semin) - Apply common schema to Rockchip DT (Serge Semin) - Add Baikal-T1 DT bindings (Serge Semin) - Add dma-ranges support in DesignWare core (Serge Semin) - Add dw_pcie_cap_is() for testing controller capabilities (Serge Semin) - Add generic resources getter to DesignWare core (Serge Semin) - Combine iATU detection procedures (Serge Semin) - Add generic clock and reset names to DesignWare core (Serge Semin) - Add Baikal-T1 PCIe controller driver (Serge Semin) * remotes/lorenzo/pci/dwc: PCI: dwc: Add Baikal-T1 PCIe controller support PCI: dwc: Introduce generic platform clocks and resets PCI: dwc: Combine iATU detection procedures PCI: dwc: Introduce generic resources getter PCI: dwc: Introduce generic controller capabilities interface PCI: dwc: Introduce dma-ranges property support for RC-host dt-bindings: PCI: dwc: Add Baikal-T1 PCIe Root Port bindings dt-bindings: PCI: dwc: Apply common schema to Rockchip DW PCIe nodes dt-bindings: PCI: dwc: Add dma-coherent property dt-bindings: PCI: dwc: Add clocks/resets common properties dt-bindings: PCI: dwc: Add reg/reg-names common properties dt-bindings: PCI: dwc: Add interrupts/interrupt-names common properties dt-bindings: PCI: dwc: Add max-functions EP property dt-bindings: PCI: dwc: Apply generic schema for generic device only dt-bindings: PCI: dwc: Add max-link-speed common property dt-bindings: PCI: dwc: Add phys/phy-names common properties dt-bindings: PCI: dwc: Remove bus node from the examples dt-bindings: PCI: dwc: Detach common RP/EP DT bindings dt-bindings: visconti-pcie: Fix interrupts array max constraints dt-bindings: imx6q-pcie: Fix clock names for imx6sx and imx8mq PCI: histb: Switch to using gpiod API PCI: imx6: Initialize PHY before deasserting core reset PCI: dwc: Use dev_info for PCIe link down event logging PCI: qcom: Fix error message for reset_control_assert() PCI: designware-ep: Disable PTM capabilities for EP mode PCI: Add PCI_PTM_CAP_RES macro PCI: dwc: Fix n_fts[] array overrun	2022-12-10 10:36:37 -06:00
Bjorn Helgaas	0ef283080e	Merge branch 'remotes/lorenzo/pci/brcmstb' - Enable Multi-MSI (Jim Quinlan) - Wait for 100ms after PERST# deassert for power and clocks to stabilize (Jim Quinlan) - Use readl_poll_timeout_atomic() instead of hand-rolled timeout loop (Jim Quinlan) - Drop needless "inline" annotations (Jim Quinlan) - Set RCB_MPS mode bit so data for reads up to MPS are returned in a single completion (Jim Quinlan) * remotes/lorenzo/pci/brcmstb: PCI: brcmstb: Set RCB_{MPS,64B}_MODE bits PCI: brcmstb: Drop needless 'inline' annotations PCI: brcmstb: Replace status loops with read_poll_timeout_atomic() PCI: brcmstb: Wait for 100ms following PERST# deassert PCI: brcmstb: Enable Multi-MSI	2022-12-10 10:36:36 -06:00
Bjorn Helgaas	0084cd6072	Merge branch 'pci/sysfs' - Fix a double free in the error path of creating sysfs "resource%d" attributes (Sascha Hauer) * pci/sysfs: PCI/sysfs: Fix double free in error path	2022-12-10 10:36:35 -06:00
Bjorn Helgaas	8961fc4f8c	Merge branch 'pci/resource' - Remove EfiMemoryMappedIO regions from the E820 map to allow PCI core to allocate BARs from them. The only purpose of EfiMemoryMappedIO is to tell the OS to map things needed by EFI runtime services, so it's often used for PCI host bridge apertures. If we can't allocate from those apertures, we can't hot-add devices (Bjorn Helgaas) * pci/resource: x86/PCI: Use pr_info() when possible x86/PCI: Fix log message typo x86/PCI: Tidy E820 removal messages PCI: Skip allocate_resource() if too little space available efi/x86: Remove EfiMemoryMappedIO from E820 map	2022-12-10 10:36:34 -06:00
Bjorn Helgaas	9303050181	Merge branch 'pci/portdrv' - Squash portdrv_core.c and portdrv_pci.c into portdrv.c to make it easier to find things (Bjorn Helgaas) - Allow AER service only for Root Ports & RCECs so portdrv can successfully bind to other devices that have AER but lack MSI (which they don't need for AER), which allows power management for those devices (Bjorn Helgaas) * pci/portdrv: PCI/portdrv: Allow AER service only for Root Ports & RCECs PCI/portdrv: Unexport pcie_port_service_register(), pcie_port_service_unregister() PCI/portdrv: Move private things to portdrv.c PCI/portdrv: Squash into portdrv.c	2022-12-10 10:36:34 -06:00
Bjorn Helgaas	e1f2d15397	Merge branch 'pci/pm' - Remove unused 'state' parameter to pci_legacy_suspend_late() (Bjorn Helgaas) * pci/pm: PCI/PM: Remove unused 'state' parameter to pci_legacy_suspend_late()	2022-12-10 10:36:33 -06:00
Bjorn Helgaas	eae10935ef	Merge branch 'pci/misc' - Use METHOD_NAME__UID instead of plain string to make it easier to find all uses (Yipeng Zou) * pci/misc: PCI/ACPI: Use METHOD_NAME__UID instead of plain string	2022-12-10 10:36:32 -06:00
Bjorn Helgaas	84c3482963	Merge branch 'pci/hotplug' - Enable pciehp by default if USB4 is enabled because USB4/Thunderbolt tunneling depends on native PCIe hotplug (Albert Zhou) - Make sure pciehp binds only to Downstream Ports, not Upstream Ports (Rafael J. Wysocki) - Remove unused get_mode1_ECC_cap callback in shpchp (Ian Cowan) - Enable pciehp Command Completed Interrupt only if supported to reduce confusion when looking at lspci output (Pali Rohár) * pci/hotplug: PCI: pciehp: Enable Command Completed Interrupt only if supported PCI: shpchp: Remove unused get_mode1_ECC_cap callback PCI: acpiphp: Avoid setting is_hotplug_bridge for PCIe Upstream Ports PCI/portdrv: Set PCIE_PORT_SERVICE_HP for Root and Downstream Ports only PCI: pciehp: Enable by default if USB4 enabled	2022-12-10 10:36:32 -06:00
Bjorn Helgaas	51ef4873c6	Merge branch 'pci/enumeration' - Only read/write PCIe Link 2 registers for devices with Links and PCIe Capability version >= 2 (Maciej W. Rozycki) - Revert a patch that cleared PCI_STATUS during enumeration because it broke Linux guests on Apple's virtualization framework (Bjorn Helgaas) - Assign PCI domain IDs using IDAs so IDs can be easily reused after loading/unloading host bridge drivers (Pali Rohár) - Fix pci_device_is_present(), which previously always returned "false" for VFs because their vendor ID is always 0xfff (Michael S. Tsirkin) - Check for alloc failure in pci_request_irq() (Zeng Heng) * pci/enumeration: PCI: Check for alloc failure in pci_request_irq() PCI: Fix pci_device_is_present() for VFs by checking PF PCI: Assign PCI domain IDs by ida_alloc() Revert "PCI: Clear PCI_STATUS when setting up device" PCI: Access Link 2 registers only for devices with Links	2022-12-10 10:36:32 -06:00
Bjorn Helgaas	5c5fb3c3a7	PCI: Skip allocate_resource() if too little space available pci_bus_alloc_from_region() allocates MMIO space by iterating through all the resources available on the bus. The available resource might be reduced if the caller requires 32-bit space or we're avoiding BIOS or E820 areas. Don't bother calling allocate_resource() if we need more space than is available in this resource. This prevents some pointless and annoying messages about avoided areas. Link: https://lore.kernel.org/r/20221208190341.1560157-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Hans de Goede <hdegoede@redhat.com>	2022-12-10 10:31:47 -06:00
Bjorn Helgaas	d8d2b65a94	PCI/portdrv: Allow AER service only for Root Ports & RCECs Previously portdrv allowed the AER service for any device with an AER capability (assuming Linux had control of AER) even though the AER service driver only attaches to Root Port and RCECs. Because get_port_device_capability() included AER for non-RP, non-RCEC devices, we tried to initialize the AER IRQ even though these devices don't generate AER interrupts. Intel DG1 and DG2 discrete graphics cards contain a switch leading to a GPU. The switch supports AER but not MSI, so initializing an AER IRQ failed, and portdrv failed to claim the switch port at all. The GPU itself could be suspended, but the switch could not be put in a low-power state because it had no driver. Don't allow the AER service on non-Root Port, non-Root Complex Event Collector devices. This means we won't enable Bus Mastering if the device doesn't require MSI, the AER service will not appear in sysfs, and the AER service driver will not bind to the device. Link: https://lore.kernel.org/r/20221207084105.84947-1-mika.westerberg@linux.intel.com Link: https://lore.kernel.org/r/20221210002922.1749403-1-helgaas@kernel.org Based-on-patch-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>	2022-12-10 10:26:40 -06:00
Michal Simek	c1ddc3dad8	PCI: xilinx-nwl: Fix coding style violations Fix code alignments and remove additional newline. Link: https://lore.kernel.org/r/17c75e7003bb8c43a0f45ae3d7c45cac230ef852.1670503129.git.michal.simek@amd.com Signed-off-by: Michal Simek <michal.simek@amd.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2022-12-08 10:50:48 -06:00
Dmitry Torokhov	76007ccc57	PCI: mvebu: Switch to using gpiod API Switch the driver away from legacy gpio/of_gpio API to gpiod API, and remove use of of_get_named_gpio_flags() which I want to make private to gpiolib. Link: https://lore.kernel.org/r/Y5EAft42YiT66mVj@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>	2022-12-07 16:03:03 -06:00
Pali Rohár	6d4671b534	PCI: pciehp: Enable Command Completed Interrupt only if supported The No Command Completed Support bit in the Slot Capabilities register indicates whether Command Completed Interrupt Enable is unsupported. We already check whether No Command Completed Support bit is set in pcie_wait_cmd(), and do not wait in this case. Don't enable this Command Completed Interrupt at all if NCCS is set, so that when users dump configuration space from userspace, the dump does not confuse them by saying that Command Completed Interrupt is not supported, but it is enabled. Link: https://lore.kernel.org/r/20220927141926.8895-2-kabel@kernel.org Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de>	2022-12-07 08:27:20 -06:00
Dmitry Torokhov	7ccb966779	PCI: aardvark: Switch to using devm_gpiod_get_optional() Switch the driver to the generic version of gpiod API (and away from OF-specific variant), so that we can stop exporting devm_gpiod_get_from_of_node(). Link: https://lore.kernel.org/r/Y3KMEZFv6dpxA+Gv@google.com Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Linus Walleij <linus.walleij@linaro.org> Acked-by: Pali Rohár <pali@kernel.org>	2022-12-07 08:19:53 -06:00
John Thomson	19098934f9	PCI: mt7621: Add sentinel to quirks table Current driver is missing a sentinel in the struct soc_device_attribute array, which causes an oops when assessed by the soc_device_match(mt7621_pcie_quirks_match) call. This was only exposed once the CONFIG_SOC_MT7621 mt7621 soc_dev_attr was fixed to register the SOC as a device, in: commit `7c18b64bba` ("mips: ralink: mt7621: do not use kzalloc too early") Fix it by adding the required sentinel. Link: https://lore.kernel.org/lkml/26ebbed1-0fe9-4af9-8466-65f841d0b382@app.fastmail.com Link: https://lore.kernel.org/r/20221205204645.301301-1-git@johnthomson.fastmail.com.au Fixes: `b483b4e4d3` ("staging: mt7621-pci: add quirks for 'E2' revision using 'soc_device_attribute'") Signed-off-by: John Thomson <git@johnthomson.fastmail.com.au> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>	2022-12-06 12:04:23 +01:00
Francisco Munoz	0a584655ef	PCI: vmd: Fix secondary bus reset for Intel bridges The reset was never applied in the current implementation because Intel Bridges owned by VMD are parentless. Internally, pci_reset_bus() applies a reset to the parent of the PCI device supplied as argument, but in this case it failed because there wasn't a parent. In more detail, this change allows the VMD driver to enumerate NVMe devices in pass-through configurations when guest reboots are performed. There was an attempted to fix this, but later we discovered that the code inside pci_reset_bus() wasn’t triggering secondary bus resets. Therefore, we updated the parameters passed to it, and now NVMe SSDs attached to VMD bridges are properly enumerated in VT-d pass-through scenarios. Link: https://lore.kernel.org/r/20221206001637.4744-1-francisco.munoz.ruiz@linux.intel.com Fixes: `6aab562229` ("PCI: vmd: Clean up domain before enumeration") Signed-off-by: Francisco Munoz <francisco.munoz.ruiz@linux.intel.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Reviewed-by: Nirmal Patel <nirmal.patel@linux.intel.com> Reviewed-by: Jonathan Derrick <jonathan.derrick@linux.dev>	2022-12-06 11:45:25 +01:00
Thomas Gleixner	c9e5bea273	PCI/MSI: Provide pci_ims_alloc/free_irq() Single vector allocation which allocates the next free index in the IMS space. The free function releases. All allocated vectors are released also via pci_free_vectors() which is also releasing MSI/MSI-X vectors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.961711347@linutronix.de	2022-12-05 22:22:35 +01:00
Thomas Gleixner	0194425af0	PCI/MSI: Provide IMS (Interrupt Message Store) support IMS (Interrupt Message Store) is a new specification which allows implementation specific storage of MSI messages contrary to the strict standard specified MSI and MSI-X message stores. This requires new device specific interrupt domains to handle the implementation defined storage which can be an array in device memory or host/guest memory which is shared with hardware queues. Add a function to create IMS domains for PCI devices. IMS domains are using the new per device domain mechanism and are configured by the device driver via a template. IMS domains are created as secondary device domains so they work side on side with MSI[-X] on the same device. The IMS domains have a few constraints: - The index space is managed by the core code. Device memory based IMS provides a storage array with a fixed size which obviously requires an index. But there is no association between index and functionality so the core can randomly allocate an index in the array. System memory based IMS does not have the concept of an index as the storage is somewhere in memory. In that case the index is purely software based to keep track of the allocations. - There is no requirement for consecutive index ranges This is currently a limitation of the MSI core and can be implemented if there is a justified use case by changing the internal storage from xarray to maple_tree. For now it's single vector allocation. - The interrupt chip must provide the following callbacks: - irq_mask() - irq_unmask() - irq_write_msi_msg() - The interrupt chip must provide the following optional callbacks when the irq_mask(), irq_unmask() and irq_write_msi_msg() callbacks cannot operate directly on hardware, e.g. in the case that the interrupt message store is in queue memory: - irq_bus_lock() - irq_bus_unlock() These callbacks are invoked from preemptible task context and are allowed to sleep. In this case the mandatory callbacks above just store the information. The irq_bus_unlock() callback is supposed to make the change effective before returning. - Interrupt affinity setting is handled by the underlying parent interrupt domain and communicated to the IMS domain via irq_write_msi_msg(). IMS domains cannot have a irq_set_affinity() callback. That's a reasonable restriction similar to the PCI/MSI device domain implementations. The domain is automatically destroyed when the PCI device is removed. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.904316841@linutronix.de	2022-12-05 22:22:34 +01:00
Thomas Gleixner	34026364df	PCI/MSI: Provide post-enable dynamic allocation interfaces for MSI-X MSI-X vectors can be allocated after the initial MSI-X enablement, but this needs explicit support of the underlying interrupt domains. Provide a function to query the ability and functions to allocate/free individual vectors post-enable. The allocation can either request a specific index in the MSI-X table or with the index argument MSI_ANY_INDEX it allocates the next free vector. The return value is a struct msi_map which on success contains both index and the Linux interrupt number. In case of failure index is negative and the Linux interrupt number is 0. The allocation function is for a single MSI-X index at a time as that's sufficient for the most urgent use case VFIO to get rid of the 'disable MSI-X, reallocate, enable-MSI-X' cycle which is prone to lost interrupts and redirections to the legacy and obviously unhandled INTx. As single index allocation is also sufficient for the use cases Jason Gunthorpe pointed out: Allocation of a MSI-X or IMS vector for a network queue. See Link below. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/all/20211126232735.547996838@linutronix.de Link: https://lore.kernel.org/r/20221124232326.731233614@linutronix.de	2022-12-05 22:22:34 +01:00
Thomas Gleixner	73bd063ca0	PCI/MSI: Provide prepare_desc() MSI domain op The setup of MSI descriptors for PCI/MSI-X interrupts depends partially on the MSI index for which the descriptor is initialized. Dynamic MSI-X vector allocation post MSI-X enablement allows to allocate vectors at a given index or at any free index in the available table range. The latter requires that the descriptor is initialized after the MSI core has chosen an index. Implement the prepare_desc() op in the PCI/MSI-X specific msi_domain_ops which is invoked before the core interrupt descriptor and the associated Linux interrupt number is allocated. That callback is also provided for the upcoming PCI/IMS implementations so the implementation specific interrupt domain can do their domain specific initialization of the MSI descriptors. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.673658806@linutronix.de	2022-12-05 22:22:34 +01:00
Thomas Gleixner	612ad43330	PCI/MSI: Split MSI-X descriptor setup The upcoming mechanism to allocate MSI-X vectors after enabling MSI-X needs to share some of the MSI-X descriptor setup. The regular descriptor setup on enable has the following code flow: 1) Allocate descriptor 2) Setup descriptor with PCI specific data 3) Insert descriptor 4) Allocate interrupts which in turn scans the inserted descriptors This cannot be easily changed because the PCI/MSI code needs to handle the legacy architecture specific allocation model and the irq domain model where quite some domains have the assumption that the above flow is how it works. Ideally the code flow should look like this: 1) Invoke allocation at the MSI core 2) MSI core allocates descriptor 3) MSI core calls back into the irq domain which fills in the domain specific parts This could be done for underlying parent MSI domains which support post-enable allocation/free but that would create significantly different code pathes for MSI/MSI-X enable. Though for dynamic allocation which wants to share the allocation code with the upcoming PCI/IMS support it's the right thing to do. Split the MSI-X descriptor setup into the preallocation part which just sets the index and fills in the horrible hack of virtual IRQs and the real PCI specific MSI-X setup part which solely depends on the index in the descriptor. This allows to provide a common dynamic allocation interface at the MSI core level for both PCI/MSI-X and PCI/IMS. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.616292598@linutronix.de	2022-12-05 22:22:34 +01:00
Thomas Gleixner	45c0402457	PCI/MSI: Remove unused pci_dev_has_special_msi_domain() The check for special MSI domains like VMD which prevents the interrupt remapping code to overwrite device::msi::domain is not longer required and has been replaced by an x86 specific version which is aware of MSI parent domains. Remove it. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232326.093093200@linutronix.de	2022-12-05 22:22:33 +01:00
Thomas Gleixner	15c72f824b	PCI/MSI: Add support for per device MSI[X] domains Provide a template and the necessary callbacks to create PCI/MSI and PCI/MSI-X domains. The domains are created when MSI or MSI-X is enabled. The domain's lifetime is either the device lifetime or in case that e.g. MSI-X was tried first and failed, then the MSI-X domain is removed and a MSI domain is created as both are mutually exclusive and reside in the default domain ID slot of the per device domain pointer array. Also expand pci_msi_domain_supports() to handle feature checks correctly even in the case that the per device domain was not yet created by checking the features supported by the MSI parent. Add the necessary setup calls into the MSI and MSI-X enable code path. These setup calls are backwards compatible. They return success when there is no parent domain found, which means the existing global domains or the legacy allocation path keep just working. Co-developed-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232325.975388241@linutronix.de	2022-12-05 22:22:32 +01:00
Thomas Gleixner	877d6c4e93	PCI/MSI: Split __pci_write_msi_msg() The upcoming per device MSI domains will create different domains for MSI and MSI-X. Split the write message function into MSI and MSI-X helpers so they can be used by those new domain functions seperately. Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124232325.857982142@linutronix.de	2022-12-05 22:22:32 +01:00
Dan Williams	e0f6fa0d42	Merge branch 'for-6.2/cxl-aer' into for-6.2/cxl Pick up CXL AER handling and correctable error extensions. Resolve conflicts with cxl_pmem_wq reworks and RCH support.	2022-12-05 12:31:30 -08:00
Thomas Gleixner	d3a11dee9f	PCI/MSI: Use msi_domain_alloc/free_irqs_all_locked() Switch to the new domain id aware interfaces to phase out the previous ones. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124230314.455168748@linutronix.de	2022-12-05 19:21:00 +01:00
Thomas Gleixner	1c89396300	genirq/msi: Rename msi_add_msi_desc() to msi_insert_msi_desc() This reflects the functionality better. No functional change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Acked-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20221124230314.103554618@linutronix.de	2022-12-05 19:20:59 +01:00
Bagas Sanjaya	6842694c50	PCI/MSI: Use bullet lists in kernel-doc comments of api.c Use bullet-list RST syntax for kernel-doc parameters' flags and interrupt mode descriptions. Otherwise Sphinx produces "Unexpected identation" errors and warnings. Fixes: `5c0997dc33` ("PCI/MSI: Move pci_alloc_irq_vectors() to api.c") Fixes: `017239c8db` ("PCI/MSI: Move pci_irq_vector() to api.c") Fixes: `be37b8428b` ("PCI/MSI: Move pci_irq_get_affinity() to api.c") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Suggested-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ahmed S. Darwish <darwi@linutronix.de> Link: https://lore.kernel.org/r/20221203100511.222136-1-bagasdotme@gmail.com	2022-12-05 18:57:46 +01:00

... 2 3 4 5 6 ...

10182 Commits