linux

iv/linux

Author	SHA1	Message	Date
Rick Wertenbroek	eab9d5a846	PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id commit 2dba285caba53f309d6060fca911b43d63f41697 upstream. Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem Vendor ID are u16 variables and are written to a u32 register of the controller. The Subsystem Vendor ID was always 0 because the u16 value was masked incorrectly with GENMASK(31,16) resulting in all lower 16 bits being set to 0 prior to the shift. Remove both masks as they are unnecessary and set the register correctly i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the Subsystem Vendor ID. This is documented in the RK3399 TRM section 17.6.7.1.17 [kwilczynski: removed unnecesary newline] Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller") Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@gmail.com Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-21 14:35:56 +02:00
Kuppuswamy Sathyanarayanan	afb634b085	PCI/EDR: Align EDR_PORT_LOCATE_DSM with PCI Firmware r3.3 [ Upstream commit e2e78a294a8a863898b781dbcf90e087eda3155d ] The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019 (document 12888 below), defined the EDR_PORT_LOCATE_DSM function with Revision ID 5 with a return value encoding (Bits 2:0 = Function, Bits 7:3 = Device, Bits 15:8 = Bus). When the ECN was integrated into PCI Firmware r3.3, sec 4.6.13, Bit 31 was added to indicate success or failure. Check Bit 31 for failure in acpi_dpc_port_get(). Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com Link: https://members.pcisig.com/wg/PCI-SIG/document/12888 Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> [bhelgaas: split into two patches, update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:03:41 +02:00
Kuppuswamy Sathyanarayanan	bdfaba14d8	PCI/EDR: Align EDR_PORT_DPC_ENABLE_DSM with PCI Firmware r3.3 [ Upstream commit f24ba846133d0edec785ac6430d4daf6e9c93a09 ] The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019 (document 12888 below), defined the EDR_PORT_DPC_ENABLE_DSM function with Revision ID 5 with Arg3 being an integer. But when the ECN was integrated into PCI Firmware r3.3, sec 4.6.12, it was defined as Revision ID 6 with Arg3 being a package containing an integer. The implementation in acpi_enable_dpc() supplies a package as Arg3 (arg4 in the code), but it previously specified Revision ID 5. Align this with PCI Firmware r3.3 by using Revision ID 6. If firmware implemented per the ECN, its Revision 5 function would receive a package as Arg3 when it expects an integer, so acpi_enable_dpc() would likely fail. If such firmware exists and lacks a Revision 6 function that expects a package, we may have to add support for Revision 5. Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com Link: https://members.pcisig.com/wg/PCI-SIG/document/12888 Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support") Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> [bhelgaas: split into two patches, update commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:03:41 +02:00
Vidya Sagar	905ec77eda	PCI: tegra194: Fix probe path for Endpoint mode [ Upstream commit 19326006a21da26532d982254677c892dae8f29b ] Tegra194 PCIe probe path is taking failure path in success case for Endpoint mode. Return success from the switch case instead of going into the failure path. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Link: https://lore.kernel.org/linux-pci/20240408093053.3948634-1-vidyas@nvidia.com Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-06-12 11:03:35 +02:00
Johan Hovold	0f7908a016	PCI/ASPM: Fix deadlock when enabling ASPM commit 1e560864159d002b453da42bd2c13a1805515a20 upstream. A last minute revert in 6.7-final introduced a potential deadlock when enabling ASPM during probe of Qualcomm PCIe controllers as reported by lockdep: ============================================ WARNING: possible recursive locking detected 6.7.0 #40 Not tainted -------------------------------------------- kworker/u16:5/90 is trying to acquire lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc but task is already holding lock: ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(pci_bus_sem); lock(pci_bus_sem); * DEADLOCK * Call trace: print_deadlock_bug+0x25c/0x348 __lock_acquire+0x10a4/0x2064 lock_acquire+0x1e8/0x318 down_read+0x60/0x184 pcie_aspm_pm_state_change+0x58/0xdc pci_set_full_power_state+0xa8/0x114 pci_set_power_state+0xc4/0x120 qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom] pci_walk_bus+0x64/0xbc qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom] The deadlock can easily be reproduced on machines like the Lenovo ThinkPad X13s by adding a delay to increase the race window during asynchronous probe where another thread can take a write lock. Add a new pci_set_power_state_locked() and associated helper functions that can be called with the PCI bus semaphore held to avoid taking the read lock twice. Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"") Signed-off-by: Johan Hovold <johan+linaro@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 6.7 [bhelgaas: backported to v6.1.y, which contains b9c370b61d73 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()""), a backport of f93e71aea6c6. This omits the drivers/pci/controller/dwc/pcie-qcom.c hunk that updates qcom_pcie_enable_aspm(), which was added by 9f4f3dfad8cf ("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops"), which is not present in v6.1.87.] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-04-27 17:07:18 +02:00
Ilpo Järvinen	39f932d295	PCI: Simplify pcie_capability_clear_and_set_word() to ..._clear_word() [ Upstream commit 0fce6e5c87faec2c8bf28d2abc8cb595f4e244b6 ] When using pcie_capability_clear_and_set_word() but not actually setting anything, use pcie_capability_clear_word() instead. Link: https://lore.kernel.org/r/20231026121924.2164-1-ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20231026121924.2164-2-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: squash] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:11 +02:00
Bjorn Helgaas	636f0fdb36	PCI/DPC: Use FIELD_GET() [ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ] Use FIELD_GET() to remove dependencies on the field position, i.e., the shift value. No functional change intended. Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:11 +02:00
Kelvin Cao	bbdfa14482	PCI: switchtec: Add support for PCIe Gen5 devices [ Upstream commit 0fb53e64705ae0fabd9593102e0f0e6812968802 ] Advertise support of Gen5 devices in the driver's device ID table and add the same IDs for the switchtec quirks. Also update driver code to accommodate them. Link: https://lore.kernel.org/r/20230624000003.2315364-3-kelvin.cao@microchip.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:10 +02:00
Kelvin Cao	87709f7ecd	PCI: switchtec: Use normal comment style [ Upstream commit 846691f5483d61259db2f4d6a3dce8b98d518794 ] Use normal comment style '/* */' for device ID description. Link: https://lore.kernel.org/r/20230624000003.2315364-2-kelvin.cao@microchip.com Signed-off-by: Kelvin Cao <kelvin.cao@microchip.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Stable-dep-of: 0fb53e64705a ("PCI: switchtec: Add support for PCIe Gen5 devices") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:10 +02:00
Maciej W. Rozycki	89a9196aec	PCI: Execute quirk_enable_clear_retrain_link() earlier [ Upstream commit 07a8d698de50c4740ac6f709c43e23a6da6e4dbc ] Make quirk_enable_clear_retrain_link() an early quirk so that any later fixups can rely on dev->clear_retrain_link to have been already initialised. [bhelgaas: reorder to just before it becomes possible to call pcie_retrain_link() earlier] Link: https://lore.kernel.org/r/alpine.DEB.2.21.2305310049000.59226@angie.orcam.me.uk Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:10 +02:00
Mike Pastore	f4aae2afe2	PCI: Delay after FLR of Solidigm P44 Pro NVMe [ Upstream commit 0ac448e0d29d6ba978684b3fa2e3ac7294ec2475 ] Prevent KVM hang when a Solidgm P44 Pro NVMe is passed through to a guest via IOMMU and the guest is subsequently rebooted. A similar issue was identified and patched by 51ba09452d11 ("PCI: Delay after FLR of Intel DC P3700 NVMe") and the same fix can be applied for this case. (Intel spun off their NAND and SSD business as Solidigm and sold it to SK Hynix in late 2021.) Link: https://lore.kernel.org/r/20230507073519.9737-1-mike@oobak.org Signed-off-by: Mike Pastore <mike@oobak.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:10 +02:00
Alvaro Karsz	57aadcc028	PCI: Avoid FLR for SolidRun SNET DPU rev 1 [ Upstream commit d089d69cc1f824936eeaa4fa172f8fa1a0949eaa ] This patch fixes a FLR bug on the SNET DPU rev 1 by setting the PCI_DEV_FLAGS_NO_FLR_RESET flag. As there is a quirk to avoid FLR (quirk_no_flr), I added a new quirk to check the rev ID before calling to quirk_no_flr. Without this patch, a SNET DPU rev 1 may hang when FLR is applied. Signed-off-by: Alvaro Karsz <alvaro.karsz@solid-run.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Message-Id: <20230110165638.123745-3-alvaro.karsz@solid-run.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-27 17:07:09 +02:00
Michael Kelley	4732ac1c23	PCI: hv: Fix ring buffer size calculation [ Upstream commit b5ff74c1ef50fe08e384026875fec660fadfaedd ] For a physical PCI device that is passed through to a Hyper-V guest VM, current code specifies the VMBus ring buffer size as 4 pages. But this is an inappropriate dependency, since the amount of ring buffer space needed is unrelated to PAGE_SIZE. For example, on x86 the ring buffer size ends up as 16 Kbytes, while on ARM64 with 64 Kbyte pages, the ring size bloats to 256 Kbytes. The ring buffer for PCI pass-thru devices is used for only a few messages during device setup and removal, so any space above a few Kbytes is wasted. Fix this by declaring the ring buffer size to be a fixed 16 Kbytes. Furthermore, use the VMBUS_RING_SIZE() macro so that the ring buffer header is properly accounted for, and so the size is rounded up to a page boundary, using the page size for which the kernel is built. While w/64 Kbyte pages this results in a 64 Kbyte ring buffer header plus a 64 Kbyte ring buffer, that's the smallest possible with that page size. It's still 128 Kbytes better than the current code. Link: https://lore.kernel.org/linux-pci/20240216202240.251818-1-mhklinux@outlook.com Signed-off-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Reviewed-by: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Long Li <longli@microsoft.com> Cc: <stable@vger.kernel.org> # 5.15.x Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:34 +02:00
Niklas Cassel	3d863cf207	PCI: dwc: endpoint: Fix advertised resizable BAR size [ Upstream commit 72e34b8593e08a0ee759b7a038e0b178418ea6f8 ] The commit message in commit fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") claims that it modifies the Resizable BAR capability to only advertise support for 1 MB size BARs. However, the commit writes all zeroes to PCI_REBAR_CAP (the register which contains the possible BAR sizes that a BAR be resized to). According to the spec, it is illegal to not have a bit set in PCI_REBAR_CAP, and 1 MB is the smallest size allowed. Set bit 4 in PCI_REBAR_CAP, so that we actually advertise support for a 1 MB BAR size. Before: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB BAR 1: current size: 1MB BAR 2: current size: 1MB BAR 3: current size: 1MB BAR 4: current size: 1MB BAR 5: current size: 1MB After: Capabilities: [2e8 v1] Physical Resizable BAR BAR 0: current size: 1MB, supported: 1MB BAR 1: current size: 1MB, supported: 1MB BAR 2: current size: 1MB, supported: 1MB BAR 3: current size: 1MB, supported: 1MB BAR 4: current size: 1MB, supported: 1MB BAR 5: current size: 1MB, supported: 1MB Fixes: fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size") Link: https://lore.kernel.org/linux-pci/20240307111520.3303774-1-cassel@kernel.org Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@vger.kernel.org> # 5.2 Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:34 +02:00
Manivannan Sadhasivam	04f3652bd0	PCI: qcom: Enable BDF to SID translation properly [ Upstream commit bf79e33cdd89db498e00a6131e937259de5f2705 ] Qcom SoCs making use of ARM SMMU require BDF to SID translation table in the driver to properly map the SID for the PCIe devices based on their BDF identifier. This is currently achieved with the help of qcom_pcie_config_sid_1_9_0() function for SoCs supporting the 1_9_0 config. But With newer Qcom SoCs starting from SM8450, BDF to SID translation is set to bypass mode by default in hardware. Due to this, the translation table that is set in the qcom_pcie_config_sid_1_9_0() is essentially unused and the default SID is used for all endpoints in SoCs starting from SM8450. This is a security concern and also warrants swapping the DeviceID in DT while using the GIC ITS to handle MSIs from endpoints. The swapping is currently done like below in DT when using GIC ITS: /* * MSIs for BDF (1:0.0) only works with Device ID 0x5980. * Hence, the IDs are swapped. */ msi-map = <0x0 &gic_its 0x5981 0x1>, <0x100 &gic_its 0x5980 0x1>; Here, swapping of the DeviceIDs ensure that the endpoint with BDF (1:0.0) gets the DeviceID 0x5980 which is associated with the default SID as per the iommu mapping in DT. So MSIs were delivered with IDs swapped so far. But this also means the Root Port (0:0.0) won't receive any MSIs (for PME, AER etc...) So let's fix these issues by clearing the BDF to SID bypass mode for all SoCs making use of the 1_9_0 config. This allows the PCIe devices to use the correct SID, thus avoiding the DeviceID swapping hack in DT and also achieving the isolation between devices. Fixes: 4c9398822106 ("PCI: qcom: Add support for configuring BDF to SID mapping for SM8250") Link: https://lore.kernel.org/linux-pci/20240307-pci-bdf-sid-fix-v1-1-9423a7e2d63c@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: stable@vger.kernel.org # 5.11 Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:34 +02:00
Manivannan Sadhasivam	a601e7a7fc	PCI: qcom: Rename qcom_pcie_config_sid_sm8250() to reflect IP version [ Upstream commit 1f70939871b260b52e9d1941f1cad740b7295c2c ] qcom_pcie_config_sid_sm8250() function no longer applies only to SM8250. So let's rename it to reflect the actual IP version and also move its definition to keep it sorted as per IP revisions. Link: https://lore.kernel.org/r/20230316081117.14288-15-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Stable-dep-of: bf79e33cdd89 ("PCI: qcom: Enable BDF to SID translation properly") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:34 +02:00
Stanislaw Gruszka	3a342fa318	PCI/AER: Block runtime suspend when handling errors [ Upstream commit 002bf2fbc00e5c4b95fb167287e2ae7d1973281e ] PM runtime can be done simultaneously with AER error handling. Avoid that by using pm_runtime_get_sync() before and pm_runtime_put() after reset in pcie_do_recovery() for all recovering devices. pm_runtime_get_sync() will increase dev->power.usage_count counter to prevent any possible future request to runtime suspend a device. It will also resume a device, if it was previously in D3hot state. I tested with igc device by doing simultaneous aer_inject and rpm suspend/resume via /sys/bus/pci/devices/PCI_ID/power/control and can reproduce: igc 0000:02:00.0: not ready 65535ms after bus reset; giving up pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25) pcieport 0000:00:1c.2: AER: subordinate device reset failed pcieport 0000:00:1c.2: AER: device recovery failed igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible The problem disappears when this patch is applied. Link: https://lore.kernel.org/r/20240212120135.146068-1-stanislaw.gruszka@linux.intel.com Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:33 +02:00
Paul Menzel	c9ef367b3e	PCI/DPC: Quirk PIO log size for Intel Raptor Lake Root Ports [ Upstream commit 627c6db20703b5d18d928464f411d0d4ec327508 ] Commit 5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root Ports") and commit 3b8803494a06 ("PCI/DPC: Quirk PIO log size for Intel Ice Lake Root Ports") add quirks for Ice, Tiger and Alder Lake Root Ports. System firmware for Raptor Lake still has the bug, so Linux logs the warning below on several Raptor Lake systems like Dell Precision 3581 with Intel Raptor Lake processor (0W18NX) system firmware/BIOS version 1.10.1. pci 0000:00:07.0: [8086:a76e] type 01 class 0x060400 pci 0000:00:07.0: DPC: RP PIO log size 0 is invalid pci 0000:00:07.1: [8086:a73f] type 01 class 0x060400 pci 0000:00:07.1: DPC: RP PIO log size 0 is invalid Apply the quirk for Raptor Lake Root Ports as well. This also enables the DPC driver to dump the RP PIO Log registers when DPC is triggered. Link: https://lore.kernel.org/r/20240305113057.56468-1-pmenzel@molgen.mpg.de Reported-by: Niels van Aert <nvaert1986@hotmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218560 Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Cc: Mika Westerberg <mika.westerberg@linux.intel.com> Cc: Niels van Aert <nvaert1986@hotmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:31 +02:00
Rafael J. Wysocki	900b81caf0	PCI/PM: Drain runtime-idle callbacks before driver removal [ Upstream commit 9d5286d4e7f68beab450deddbb6a32edd5ecf4bf ] A race condition between the .runtime_idle() callback and the .remove() callback in the rtsx_pcr PCI driver leads to a kernel crash due to an unhandled page fault [1]. The problem is that rtsx_pci_runtime_idle() is not expected to be running after pm_runtime_get_sync() has been called, but the latter doesn't really guarantee that. It only guarantees that the suspend and resume callbacks will not be running when it returns. However, if a .runtime_idle() callback is already running when pm_runtime_get_sync() is called, the latter will notice that the runtime PM status of the device is RPM_ACTIVE and it will return right away without waiting for the former to complete. In fact, it cannot wait for .runtime_idle() to complete because it may be called from that callback (it arguably does not make much sense to do that, but it is not strictly prohibited). Thus in general, whoever is providing a .runtime_idle() callback needs to protect it from running in parallel with whatever code runs after pm_runtime_get_sync(). [Note that .runtime_idle() will not start after pm_runtime_get_sync() has returned, but it may continue running then if it has started earlier.] One way to address that race condition is to call pm_runtime_barrier() after pm_runtime_get_sync() (not before it, because a nonzero value of the runtime PM usage counter is necessary to prevent runtime PM callbacks from being invoked) to wait for the .runtime_idle() callback to complete should it be running at that point. A suitable place for doing that is in pci_device_remove() which calls pm_runtime_get_sync() before removing the driver, so it may as well call pm_runtime_barrier() subsequently, which will prevent the race in question from occurring, not just in the rtsx_pcr driver, but in any PCI drivers providing .runtime_idle() callbacks. Link: https://lore.kernel.org/lkml/20240229062201.49500-1-kai.heng.feng@canonical.com/ # [1] Link: https://lore.kernel.org/r/5761426.DvuYhMxLoT@kreacher Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Ricky Wu <ricky_wu@realtek.com> Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Cc: <stable@vger.kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-04-03 15:19:31 +02:00
Jörg Wedekind	ce106d8ef0	PCI: Mark 3ware-9650SE Root Port Extended Tags as broken [ Upstream commit baf67aefbe7d7deafa59ca49612d163f8889934c ] Per PCIe r6.1, sec 2.2.6.2 and 7.5.3.4, a Requester may not use 8-bit Tags unless its Extended Tag Field Enable is set, but all Receivers/Completers must handle 8-bit Tags correctly regardless of their Extended Tag Field Enable. Some devices do not handle 8-bit Tags as Completers, so add a quirk for them. If we find such a device, we disable Extended Tags for the entire hierarchy to make peer-to-peer DMA possible. The 3ware 9650SE seems to have issues with handling 8-bit tags. Mark it as broken. This fixes PCI Parity Errors like : 3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing. 3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing. Link: https://lore.kernel.org/r/20240219132811.8351-1-joerg@wedekind.de Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425 Signed-off-by: Jörg Wedekind <joerg@wedekind.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:50 -04:00
Yang Yingliang	6632a54ac8	NTB: fix possible name leak in ntb_register_device() [ Upstream commit aebfdfe39b9327a3077d0df8db3beb3160c9bdd0 ] If device_register() fails in ntb_register_device(), the device name allocated by dev_set_name() should be freed. As per the comment in device_register(), callers should use put_device() to give up the reference in the error path. So fix this by calling put_device() in the error path so that the name can be freed in kobject_cleanup(). As a result of this, put_device() in the error path of ntb_register_device() is removed and the actual error is returned. Fixes: a1bd3baeb2f1 ("NTB: Add NTB hardware abstraction layer") Signed-off-by: Yang Yingliang <yangyingliang@huawei.com> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20231201033057.1399131-1-yangyingliang@huaweicloud.com [mani: reworded commit message] Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:49 -04:00
ruanjinjie	298f7f1370	NTB: EPF: fix possible memory leak in pci_vntb_probe() [ Upstream commit 956578e3d397e00d6254dc7b5194d28587f98518 ] As ntb_register_device() don't handle error of device_register(), if ntb_register_device() returns error in pci_vntb_probe(), name of kobject which is allocated in dev_set_name() called in device_add() is leaked. As comment of device_add() says, it should call put_device() to drop the reference count that was set in device_initialize() when it fails, so the name can be freed in kobject_cleanup(). Signed-off-by: ruanjinjie <ruanjinjie@huawei.com> Signed-off-by: Jon Mason <jdmason@kudzu.us> Stable-dep-of: aebfdfe39b93 ("NTB: fix possible name leak in ntb_register_device()") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:48 -04:00
Christophe JAILLET	4421c74602	PCI: switchtec: Fix an error handling path in switchtec_pci_probe() [ Upstream commit dec529b0b0572b32f9eb91c882dd1f08ca657efb ] The commit in Fixes changed the logic on how resources are released and introduced a new switchtec_exit_pci() that need to be called explicitly in order to undo a corresponding switchtec_init_pci(). This was done in the remove function, but not in the probe. Fix the probe now. Fixes: df25461119d9 ("PCI: switchtec: Fix stdev_release() crash after surprise hot remove") Link: https://lore.kernel.org/r/01446d2ccb91a578239915812f2b7dfbeb2882af.1703428183.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:47 -04:00
Ilpo Järvinen	43f4364c8f	PCI/DPC: Print all TLP Prefixes, not just the first [ Upstream commit 6568d82512b0a64809acff3d7a747362fa4288c8 ] The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec 7.9.14.13) but the loop in dpc_process_rp_pio_error() keeps reading from the first DWORD, so we print only the first PIO TLP Prefix (duplicated several times), and we never print the second, third, etc., Prefixes. Add the iteration count based offset calculation into the config read. Fixes: f20c4ea49ec4 ("PCI/DPC: Add eDPC support") Link: https://lore.kernel.org/r/20240118110815.3867-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> [bhelgaas: add user-visible details to commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:45 -04:00
Ethan Zhao	48fba9d7f5	PCI: Make pci_dev_is_disconnected() helper public for other drivers [ Upstream commit 39714fd73c6b60a8d27bcc5b431afb0828bf4434 ] Make pci_dev_is_disconnected() public so that it can be called from Intel VT-d driver to quickly fix/workaround the surprise removal unplug hang issue for those ATS capable devices on PCIe switch downstream hotplug capable ports. Beside pci_device_is_present() function, this one has no config space space access, so is light enough to optimize the normal pure surprise removal and safe removal flow. Acked-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Tested-by: Haorong Ye <yehaorong@bytedance.com> Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com> Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com> Signed-off-by: Joerg Roedel <jroedel@suse.de> Stable-dep-of: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected") Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-26 18:20:39 -04:00
Xiaowei Bao	507eeaad4d	PCI: layerscape: Add workaround for lost link capabilities during reset [ Upstream commit 17cf8661ee0f065c08152e611a568dd1fb0285f1 ] The endpoint controller loses the Maximum Link Width and Supported Link Speed value from the Link Capabilities Register - initially configured by the Reset Configuration Word (RCW) - during a link-down or hot reset event. Address this issue in the endpoint event handler. Link: https://lore.kernel.org/r/20230720135834.1977616-2-Frank.Li@nxp.com Fixes: a805770d8a22 ("PCI: layerscape: Add EP mode support") Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Hou Zhiqiang <Zhiqiang.Hou@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Acked-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-06 14:45:05 +00:00
Frank Li	e30f82597b	PCI: layerscape: Add the endpoint linkup notifier support [ Upstream commit 061cbfab09fb35898f2907d42f936cf9ae271d93 ] Layerscape has PME interrupt, which can be used as linkup notifier. Set CFG_READY bit of PEX_PF0_CONFIG to enable accesses from root complex when linkup detected. Link: https://lore.kernel.org/r/20230515151049.2797105-1-Frank.Li@nxp.com Signed-off-by: Xiaowei Bao <xiaowei.bao@nxp.com> Signed-off-by: Frank Li <Frank.Li@nxp.com> Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-03-06 14:45:05 +00:00
Vidya Sagar	2a19e0042b	PCI/MSI: Prevent MSI hardware interrupt number truncation commit db744ddd59be798c2627efbfc71f707f5a935a40 upstream. While calculating the hardware interrupt number for a MSI interrupt, the higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI domain number gets truncated because of the shifted value casting to return type of pci_domain_nr() which is 'int'. This for example is resulting in same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0. To address this cast the PCI domain number to 'irq_hw_number_t' before left shifting it to calculate the hardware interrupt number. Please note that this fixes the issue only on 64-bit systems and doesn't change the behavior for 32-bit systems i.e. the 32-bit systems continue to have the issue. Since the issue surfaces only if there are too many PCIe controllers in the system which usually is the case in modern server systems and they don't tend to run 32-bit kernels. Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-01 13:26:33 +01:00
Dan Carpenter	6967ddd378	PCI: dwc: Fix a 64bit bug in dw_pcie_ep_raise_msix_irq() commit b5d1b4b46f856da1473c7ba9a5cdfcb55c9b2478 upstream. The "msg_addr" variable is u64. However, the "aligned_offset" is an unsigned int. This means that when the code does: msg_addr &= ~aligned_offset; it will unintentionally zero out the high 32 bits. Use ALIGN_DOWN() to do the alignment instead. Fixes: 2217fffcd63f ("PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support") Link: https://lore.kernel.org/r/af59c7ad-ab93-40f7-ad4a-7ac0b14d37f5@moroto.mountain Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: <stable@vger.kernel.org> Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-01 13:26:25 +01:00
Bjorn Helgaas	fc557b76dc	PCI/AER: Decode Requester ID when no error info found [ Upstream commit 1291b716bbf969e101d517bfb8ba18d958f758b8 ] When a device with AER detects an error, it logs error information in its own AER Error Status registers. It may send an Error Message to the Root Port (RCEC in the case of an RCiEP), which logs the fact that an Error Message was received (Root Error Status) and the Requester ID of the message source (Error Source Identification). aer_print_port_info() prints the Requester ID from the Root Port Error Source in the usual Linux "bb:dd.f" format, but when find_source_device() finds no error details in the hierarchy below the Root Port, it printed the raw Requester ID without decoding it. Decode the Requester ID in the usual Linux format so it matches other messages. Sample message changes: - pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5 - pcieport 0000:00:1c.5: AER: can't find device of ID00e5 + pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5 + pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5 Link: https://lore.kernel.org/r/20231206224231.732765-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-02-05 20:12:59 +00:00
Ilpo Järvinen	83c895561a	PCI: Fix 64GT/s effective data rate calculation [ Upstream commit ac4f1897fa5433a1b07a625503a91b6aa9d7e643 ] Unlike the lower rates, the PCIe 64GT/s Data Rate uses 1b/1b encoding, not 128b/130b (PCIe r6.1 sec 1.2, Table 1-1). Correct the PCIE_SPEED2MBS_ENC() calculation to reflect that. Link: https://lore.kernel.org/r/20240102172701.65501-1-ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-02-05 20:12:58 +00:00
Daniel Stodden	1d83c85922	PCI: switchtec: Fix stdev_release() crash after surprise hot remove [ Upstream commit df25461119d987b8c81d232cfe4411e91dcabe66 ] A PCI device hot removal may occur while stdev->cdev is held open. The call to stdev_release() then happens during close or exit, at a point way past switchtec_pci_remove(). Otherwise the last ref would vanish with the trailing put_device(), just before return. At that later point in time, the devm cleanup has already removed the stdev->mmio_mrpc mapping. Also, the stdev->pdev reference was not a counted one. Therefore, in DMA mode, the iowrite32() in stdev_release() will cause a fatal page fault, and the subsequent dma_free_coherent(), if reached, would pass a stale &stdev->pdev->dev pointer. Fix by moving MRPC DMA shutdown into switchtec_pci_remove(), after stdev_kill(). Counting the stdev->pdev ref is now optional, but may prevent future accidents. Reproducible via the script at https://lore.kernel.org/r/20231113212150.96410-1-dns@arista.com Link: https://lore.kernel.org/r/20231122042316.91208-2-dns@arista.com Signed-off-by: Daniel Stodden <dns@arista.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Dmitry Safonov <dima@arista.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-02-05 20:12:58 +00:00
Guilherme G. Piccoli	5e0160dab1	PCI: Only override AMD USB controller if required [ Upstream commit e585a37e5061f6d5060517aed1ca4ccb2e56a34c ] By running a Van Gogh device (Steam Deck), the following message was noticed in the kernel log: pci 0000:04:00.3: PCI class overridden (0x0c03fe -> 0x0c03fe) so dwc3 driver can claim this instead of xhci Effectively this means the quirk executed but changed nothing, since the class of this device was already the proper one (likely adjusted by newer firmware versions). Check and perform the override only if necessary. Link: https://lore.kernel.org/r/20231120160531.361552-1-gpiccoli@igalia.com Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Huang Rui <ray.huang@amd.com> Cc: Vicki Pfau <vi@endrift.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-02-05 20:12:58 +00:00
Ido Schimmel	c0d5a69322	PCI: Add no PM reset quirk for NVIDIA Spectrum devices [ Upstream commit 3ed48c80b28d8dcd584d6ddaf00c75b7673e1a05 ] Spectrum-{1,2,3,4} devices report that a D3hot->D0 transition causes a reset (i.e., they advertise NoSoftRst-). However, this transition does not have any effect on the device: It continues to be operational and network ports remain up. Advertising this support makes it seem as if a PM reset is viable for these devices. Mark it as unavailable to skip it when testing reset methods. Before: # cat /sys/bus/pci/devices/0000\:03\:00.0/reset_method pm bus After: # cat /sys/bus/pci/devices/0000\:03\:00.0/reset_method bus Signed-off-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Petr Machata <petrm@nvidia.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-02-05 20:12:50 +00:00
Jianjun Wang	69f0bebe91	PCI: mediatek-gen3: Fix translation window size calculation [ Upstream commit 9ccc1318cf4bd90601f221268e42c3374703d681 ] When using the fls() helper, the translation table should be a power of two; otherwise, the resulting value will not be correct. For example, given fls(0x3e00000) - 1 = 25, the PCIe translation window size will be set to 0x2000000 instead of the expected size 0x3e00000. Fix the translation window by splitting the MMIO space into multiple tables if its size is not a power of two. [kwilczynski: commit log] Link: https://lore.kernel.org/linux-pci/20231023081423.18559-1-jianjun.wang@mediatek.com Fixes: d3bf75b579b9 ("PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192") Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:27:48 -08:00
Siddharth Vadapalli	94667790e5	PCI: keystone: Fix race condition when initializing PHYs [ Upstream commit c12ca110c613a81cb0f0099019c839d078cd0f38 ] The PCI driver invokes the PHY APIs using the ks_pcie_enable_phy() function. The PHY in this case is the Serdes. It is possible that the PCI instance is configured for two lane operation across two different Serdes instances, using one lane of each Serdes. In such a configuration, if the reference clock for one Serdes is provided by the other Serdes, it results in a race condition. After the Serdes providing the reference clock is initialized by the PCI driver by invoking its PHY APIs, it is not guaranteed that this Serdes remains powered on long enough for the PHY APIs based initialization of the dependent Serdes. In such cases, the PLL of the dependent Serdes fails to lock due to the absence of the reference clock from the former Serdes which has been powered off by the PM Core. Fix this by obtaining reference to the PHYs before invoking the PHY initialization APIs and releasing reference after the initialization is complete. Link: https://lore.kernel.org/linux-pci/20230927041845.1222080-1-s-vadapalli@ti.com Fixes: 49229238ab47 ("PCI: keystone: Cleanup PHY handling") Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Acked-by: Ravi Gunasekaran <r-gunasekaran@ti.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2024-01-25 15:27:48 -08:00
qizhong cheng	88f4dd8b9f	PCI: mediatek: Clear interrupt status before dispatching handler commit 4e11c29873a8a296a20f99b3e03095e65ebf897d upstream. We found a failure when using the iperf tool during WiFi performance testing, where some MSIs were received while clearing the interrupt status, and these MSIs cannot be serviced. The interrupt status can be cleared even if the MSI status remains pending. As such, given the edge-triggered interrupt type, its status should be cleared before being dispatched to the handler of the underling device. [kwilczynski: commit log, code comment wording] Link: https://lore.kernel.org/linux-pci/20231211094923.31967-1-jianjun.wang@mediatek.com Fixes: 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and MT7622") Signed-off-by: qizhong cheng <qizhong.cheng@mediatek.com> Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> [bhelgaas: rewrap comment] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-25 15:27:44 -08:00
Niklas Cassel	0c883bc9fa	PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support commit 2217fffcd63f86776c985d42e76daa43a56abdf1 upstream. Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to support iATUs which require a specific alignment. However, this support cannot have been properly tested. The whole point is for the iATU to map an address that is aligned, using dw_pcie_ep_map_addr(), and then let the writel() write to ep->msi_mem + aligned_offset. Thus, modify the address that is mapped such that it is aligned. With this change, dw_pcie_ep_raise_msix_irq() matches the logic in dw_pcie_ep_raise_msi_irq(). Link: https://lore.kernel.org/linux-pci/20231128132231.2221614-1-nks@flawful.org Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address") Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: stable@vger.kernel.org # 5.7 Cc: Kishon Vijay Abraham I <kishon@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-25 15:27:44 -08:00
LeoLiuoc	d08e756e25	PCI: Add ACS quirk for more Zhaoxin Root Ports commit e367e3c765f5477b2e79da0f1399aed49e2d1e37 upstream. Add more Root Port Device IDs to pci_quirk_zhaoxin_pcie_ports_acs() for some new Zhaoxin platforms. Fixes: 299bd044a6f3 ("PCI: Add ACS quirk for Zhaoxin Root/Downstream Ports") Link: https://lore.kernel.org/r/20231211091543.735903-1-LeoLiu-oc@zhaoxin.com Signed-off-by: LeoLiuoc <LeoLiu-oc@zhaoxin.com> [bhelgaas: update subject, drop changelog, add Fixes, add stable tag, fix whitespace, wrap code comment] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> # 5.7 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-20 11:50:11 +01:00
Bjorn Helgaas	b9c370b61d	Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()" commit f93e71aea6c60ebff8adbd8941e678302d377869 upstream. This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c. Michael reported that when attempting to resume from suspend to RAM on ASUS mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay with no output, followed by a reboot. Workarounds include: - Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") - Booting with "pcie_aspm=off" - Booting with "pcie_aspm.policy=performance" - "echo 0 \| sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm" before suspending - Connecting a USB flash drive Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()") Reported-by: Michael Schaller <michael@5challer.de> Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-01-10 17:10:20 +01:00
Jiaxun Yang	0c196180b5	PCI: loongson: Limit MRRS to 256 commit ef61a0405742a9f7f6051bc6fd2f017d87d07911 upstream. This is a partial revert of 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS increases") for MIPS-based Loongson. Some MIPS Loongson systems don't support arbitrary Max_Read_Request_Size (MRRS) settings. 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS increases") worked around that by (1) assuming that firmware configured MRRS to the maximum supported value and (2) preventing the PCI core from increasing MRRS. Unfortunately, some firmware doesn't set that maximum MRRS correctly, which results in devices not being initialized correctly. One symptom, from the Debian report below, is this: ata4.00: exception Emask 0x0 SAct 0x20000000 SErr 0x0 action 0x6 frozen ata4.00: failed command: WRITE FPDMA QUEUED ata4.00: cmd 61/20:e8:00:f0:e1/00:00:00:00:00/40 tag 29 ncq dma 16384 out res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) ata4.00: status: { DRDY } ata4: hard resetting link Limit MRRS to 256 because MIPS Loongson with higher MRRS support is considered rare. This must be done at device enablement stage because the MRRS setting may get lost if PCI_COMMAND_MASTER on the parent bridge is cleared, and we are only sure parent bridge is enabled at this point. Fixes: 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS increases") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217680 Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1035587 Link: https://lore.kernel.org/r/20231201115028.84351-1-jiaxun.yang@flygoat.com Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Huacai Chen <chenhuacai@loongson.cn> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:00:20 +01:00
Bjorn Helgaas	56d1891594	Revert "PCI: acpiphp: Reassign resources on bridge if necessary" commit 5df12742b7e3aae2594a30a9d14d5d6e9e7699f4 upstream. This reverts commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 and the subsequent fix to it: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") 40613da52b13 fixed a problem where hot-adding a device with large BARs failed if the bridge windows programmed by firmware were not large enough. cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") fixed a problem with 40613da52b13: an ACPI hot-add of a device on a PCI root bus (common in the virt world) or firmware sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron 7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs. Unfortunately the combination of 40613da52b13 and cc22522fd55e caused other problems: - Fiona reported that hot-add of SCSI disks in QEMU virtual machine fails sometimes. - Dongli reported a similar problem with hot-add of SCSI disks. - Jonathan reported a console freeze during boot on bare metal due to an error in radeon GPU initialization. Revert both patches to avoid adding these problems. This means we will again see the problems with hot-adding devices with large BARs and the NULL pointer dereferences and suspend/resume issues that 40613da52b13 and cc22522fd55e were intended to fix. Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary") Fixes: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus") Reported-by: Fiona Ebner <f.ebner@proxmox.com> Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com Reported-by: Dongli Zhang <dongli.zhang@oracle.com> Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.com Reported-by: Jonathan Woithe <jwoithe@just42.net> Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.au Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Igor Mammedov <imammedo@redhat.com> Cc: <stable@vger.kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-12-20 17:00:20 +01:00
Manivannan Sadhasivam	5bc8d96fed	PCI: qcom-ep: Add dedicated callback for writing to DBI2 registers [ Upstream commit a07d2497ed657eb2efeb967af47e22f573dcd1d6 ] The DWC core driver exposes the write_dbi2() callback for writing to the DBI2 registers in a vendor-specific way. On the Qcom EP platforms, the DBI_CS2 bit in the ELBI region needs to be asserted before writing to any DBI2 registers and deasserted once done. So, let's implement the callback for the Qcom PCIe EP driver so that the DBI2 writes are correctly handled in the hardware. Without this callback, the DBI2 register writes like BAR size won't go through and as a result, the default BAR size is set for all BARs. [kwilczynski: commit log, renamed function to match the DWC convention] Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver") Suggested-by: Serge Semin <fancer.lancer@gmail.com> Link: https://lore.kernel.org/linux-pci/20231025130029.74693-2-manivannan.sadhasivam@linaro.org Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Reviewed-by: Serge Semin <fancer.lancer@gmail.com> Cc: stable@vger.kernel.org # 5.16+ Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Lukas Wunner	1c8f75ee92	PCI: Lengthen reset delay for VideoPropulsion Torrent QN16e card [ Upstream commit c9260693aa0c1e029ed23693cfd4d7814eee6624 ] Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume") shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100 msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required for Conventional PCI. But it turns out that there are PCIe devices which require a longer delay than prescribed before first config space access after reset recovery or resume from D3cold: Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator "raises a PCI system error (PERR), as reported by the IPMI event log, and the hardware itself would suffer a catastrophic event, cycling the server" unless the longer delay is observed. The card is specified to conform to PCIe r1.0 and indeed only supports Gen1 speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same 100 msec delay as PCIe r6.1 sec 6.6.1: To allow components to perform internal initialization, system software must wait for at least 100 ms from the end of a reset (cold/warm/hot) before it is permitted to issue Configuration Requests The behavior of the Torrent QN16e card thus appears to be a quirk. Treat it as such and lengthen the reset delay for this specific device. Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume") Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.1695304512.git.lukas@wunner.de Reported-by: Chad Schroeder <CSchroeder@sonifi.com> Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM6PR16MB2844.namprd16.prod.outlook.com/ Tested-by: Chad Schroeder <CSchroeder@sonifi.com> Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.4+ Signed-off-by: Sasha Levin <sashal@kernel.org>	2023-12-08 08:51:18 +01:00
Uwe Kleine-König	efd8e6d19c	PCI: exynos: Don't discard .remove() callback commit 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 upstream. With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. This fixes the following warning by modpost: WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text) (with ARCH=x86_64 W=1 allmodconfig). Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos") Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:11 +00:00
Uwe Kleine-König	75bf9a8b0e	PCI: kirin: Don't discard .remove() callback commit 3064ef2e88c1629c1e67a77d7bc20020b35846f2 upstream. With CONFIG_PCIE_KIRIN=y and kirin_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. This fixes the following warning by modpost: drivers/pci/controller/dwc/pcie-kirin: section mismatch in reference: kirin_pcie_driver+0x8 (section: .data) -> kirin_pcie_remove (section: .exit.text) (with ARCH=x86_64 W=1 allmodconfig). Fixes: 000f60db784b ("PCI: kirin: Add support for a PHY layer") Link: https://lore.kernel.org/r/20231001170254.2506508-3-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:11 +00:00
Heiner Kallweit	e02b9c6a83	PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common() commit 8e37372ad0bea4c9b4712d9943f6ae96cff9491f upstream. aspm_attr_store_common(), which handles sysfs control of ASPM, has the same problem as fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1"): disabling L1 adds only ASPM_L1 (but not any of the L1.x substates) to the "aspm_disable" mask. Enabling one substate, e.g., L1.1, via sysfs removes ASPM_L1 from the disable mask. Since disabling L1 via sysfs doesn't add any of the substates to the disable mask, enabling L1.1 actually enables all the substates. In this scenario: - Write 0 to "l1_aspm" to disable L1 - Write 1 to "l1_1_aspm" to enable L1.1 the intention is to disable L1 and all L1.x substates, then enable just L1.1, but in fact, all L1.x substates are enabled. Fix this by explicitly disabling all the L1.x substates when disabling L1. Fixes: 72ea91afbfb0 ("PCI/ASPM: Add sysfs attributes for controlling ASPM link states") Link: https://lore.kernel.org/r/6ba7dd79-9cfe-4ed0-a002-d99cb842f361@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:11 +00:00
Uwe Kleine-König	938c4c7318	PCI: keystone: Don't discard .probe() callback commit 7994db905c0fd692cf04c527585f08a91b560144 upstream. The __init annotation makes the ks_pcie_probe() function disappear after booting completes. However a device can also be bound later. In that case, we try to call ks_pcie_probe(), but the backing memory is likely already overwritten. The right thing to do is do always have the probe callback available. Note that the (wrong) __refdata annotation prevented this issue to be noticed by modpost. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-5-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:10 +00:00
Uwe Kleine-König	b7d27cbfef	PCI: keystone: Don't discard .remove() callback commit 200bddbb3f5202bbce96444fdc416305de14f547 upstream. With CONFIG_PCIE_KEYSTONE=y and ks_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse. The right thing to do is do always have the remove callback available. Note that this driver cannot be compiled as a module, so ks_pcie_remove() was always discarded before this change and modpost couldn't warn about this issue. Furthermore the __ref annotation also prevents a warning. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-4-u.kleine-koenig@pengutronix.de Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:10 +00:00
Lukas Wunner	4e0fbf3188	PCI/sysfs: Protect driver's D3cold preference from user space commit 70b70a4307cccebe91388337b1c85735ce4de6ff upstream. struct pci_dev contains two flags which govern whether the device may suspend to D3cold: * no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold) * d3cold_allowed provides an opt-out for user space (default is true, user space may set to false) Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working. That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device. That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out. Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.1695025365.git.lukas@wunner.de Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2023-11-28 17:07:08 +00:00

1 2 3 4 5 ...

9911 Commits