IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 9eee5330656bf92f51cb1f09b2dc9f8cf975b3d1 upstream.
KFENCE reports the following UAF:
BUG: KFENCE: use-after-free read in __pci_enable_msi_range+0x2c0/0x488
Use-after-free read at 0x0000000024629571 (in kfence-#12):
__pci_enable_msi_range+0x2c0/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
kfence-#12: 0x0000000008614900-0x00000000e06c228d, size=104, cache=kmalloc-128
allocated by task 81 on cpu 7 at 10.808142s:
__kmem_cache_alloc_node+0x1f0/0x2bc
kmalloc_trace+0x44/0x138
msi_alloc_desc+0x3c/0x9c
msi_domain_insert_msi_desc+0x30/0x78
msi_setup_msi_desc+0x13c/0x184
__pci_enable_msi_range+0x258/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
freed by task 81 on cpu 7 at 10.811436s:
msi_domain_free_descs+0xd4/0x10c
msi_domain_free_locked.part.0+0xc0/0x1d8
msi_domain_alloc_irqs_all_locked+0xb4/0xbc
pci_msi_setup_msi_irqs+0x30/0x4c
__pci_enable_msi_range+0x2a8/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
Descriptor allocation done in:
__pci_enable_msi_range
msi_capability_init
msi_setup_msi_desc
msi_insert_msi_desc
msi_domain_insert_msi_desc
msi_alloc_desc
...
Freed in case of failure in __msi_domain_alloc_locked()
__pci_enable_msi_range
msi_capability_init
pci_msi_setup_msi_irqs
msi_domain_alloc_irqs_all_locked
msi_domain_alloc_locked
__msi_domain_alloc_locked => fails
msi_domain_free_locked
...
That failure propagates back to pci_msi_setup_msi_irqs() in
msi_capability_init() which accesses the descriptor for unmasking in the
error exit path.
Cure it by copying the descriptor and using the copy for the error exit path
unmask operation.
[ tglx: Massaged change log ]
Fixes: bf6e054e0e3f ("genirq/msi: Provide msi_device_populate/destroy_sysfs()")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Heelgas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240624203729.1094506-1-smostafa@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 6613443ffc49d03e27f0404978f685c4eac43fba ]
On runtime resume, pci_dev_wait() is called:
pci_pm_runtime_resume()
pci_pm_bridge_power_up_actions()
pci_bridge_wait_for_secondary_bus()
pci_dev_wait()
While a device is runtime suspended along with its PCI hierarchy, the
device could get disconnected. In such case, the link will not come up no
matter how long pci_dev_wait() waits for it.
Besides the above mentioned case, there could be other ways to get the
device disconnected while pci_dev_wait() is waiting for the link to come
up.
Make pci_dev_wait() exit if the device is already disconnected to avoid
unnecessary delay.
The use cases of pci_dev_wait() boil down to two:
1. Waiting for the device after reset
2. pci_bridge_wait_for_secondary_bus()
The callers in both cases seem to benefit from propagating the
disconnection as error even if device disconnection would be more
analoguous to the case where there is no device in the first place which
return 0 from pci_dev_wait(). In the case 2, it results in unnecessary
marking of the devices disconnected again but that is just harmless extra
work.
Also make sure compiler does not become too clever with dev->error_state
and use READ_ONCE() to force a fetch for the up-to-date value.
Link: https://lore.kernel.org/r/20240208132322.4811-1-ilpo.jarvinen@linux.intel.com
Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 256df20c590bf0e4d63ac69330cf23faddac3e08 ]
Hewlett-Packard HP Pavilion 17 Notebook PC/1972 is an Intel Ivy Bridge
system with a muxless AMD Radeon dGPU. Attempting to use the dGPU fails
with the following sequence:
ACPI Error: Aborting method \AMD3._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
radeon 0000:01:00.0: not ready 1023ms after resume; waiting
radeon 0000:01:00.0: not ready 2047ms after resume; waiting
radeon 0000:01:00.0: not ready 4095ms after resume; waiting
radeon 0000:01:00.0: not ready 8191ms after resume; waiting
radeon 0000:01:00.0: not ready 16383ms after resume; waiting
radeon 0000:01:00.0: not ready 32767ms after resume; waiting
radeon 0000:01:00.0: not ready 65535ms after resume; giving up
radeon 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
The issue is that the Root Port the dGPU is connected to can't handle the
transition from D3cold to D0 so the dGPU can't properly exit runtime PM.
The existing logic in pci_bridge_d3_possible() checks for systems that are
newer than 2015 to decide that D3 is safe. This would nominally work for
an Ivy Bridge system (which was discontinued in 2015), but this system
appears to have continued to receive BIOS updates until 2017 and so this
existing logic doesn't appropriately capture it.
Add the system to bridge_d3_blacklist to prevent D3cold from being used.
Link: https://lore.kernel.org/r/20240307163709.323-1-mario.limonciello@amd.com
Reported-by: Eric Heintzmann <heintzmann.eric@free.fr>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3229
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Eric Heintzmann <heintzmann.eric@free.fr>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 2dba285caba53f309d6060fca911b43d63f41697 upstream.
Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem
Vendor ID are u16 variables and are written to a u32 register of the
controller. The Subsystem Vendor ID was always 0 because the u16 value
was masked incorrectly with GENMASK(31,16) resulting in all lower 16
bits being set to 0 prior to the shift.
Remove both masks as they are unnecessary and set the register correctly
i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the
Subsystem Vendor ID.
This is documented in the RK3399 TRM section 17.6.7.1.17
[kwilczynski: removed unnecesary newline]
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@gmail.com
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit e2e78a294a8a863898b781dbcf90e087eda3155d ]
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_LOCATE_DSM function with
Revision ID 5 with a return value encoding (Bits 2:0 = Function, Bits 7:3 =
Device, Bits 15:8 = Bus). When the ECN was integrated into PCI Firmware
r3.3, sec 4.6.13, Bit 31 was added to indicate success or failure.
Check Bit 31 for failure in acpi_dpc_port_get().
Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com
Link: https://members.pcisig.com/wg/PCI-SIG/document/12888
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[bhelgaas: split into two patches, update commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit f24ba846133d0edec785ac6430d4daf6e9c93a09 ]
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_DPC_ENABLE_DSM function with
Revision ID 5 with Arg3 being an integer. But when the ECN was integrated
into PCI Firmware r3.3, sec 4.6.12, it was defined as Revision ID 6 with
Arg3 being a package containing an integer.
The implementation in acpi_enable_dpc() supplies a package as Arg3 (arg4 in
the code), but it previously specified Revision ID 5. Align this with PCI
Firmware r3.3 by using Revision ID 6.
If firmware implemented per the ECN, its Revision 5 function would receive
a package as Arg3 when it expects an integer, so acpi_enable_dpc() would
likely fail. If such firmware exists and lacks a Revision 6 function that
expects a package, we may have to add support for Revision 5.
Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com
Link: https://members.pcisig.com/wg/PCI-SIG/document/12888
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[bhelgaas: split into two patches, update commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 73cb3a35f94db723c0211ad099bce55b2155e3f0 ]
Two changes were made in link retraining logic independent of each other.
The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added a
check to pcie_retrain_link() to ensure no Link Training is currently active
to address the Implementation Note in PCIe r6.1 sec 7.5.3.7. At that time
pcie_wait_for_retrain() only checked for the Link Training (LT) bit being
cleared.
The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to
pcie_retrain_link()") generalized pcie_wait_for_retrain() into
pcie_wait_for_link_status() which can wait either for LT or the Data Link
Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting waiting
for either cleared or set using 'active' argument.
In the merge commit 1abb47390350 ("Merge branch 'pci/enumeration'"), those
two divergent branches converged. The merge changed LT bit checking added
in the commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") to now
wait for completion of any ongoing Link Training using DLLLA bit being set
if 'use_lt' is false.
When 'use_lt' is false, the pseudo-code steps of what occurs in
pcie_retrain_link():
1. Wait for DLLLA==1
2. Trigger link to retrain
3. Wait for DLLLA==1
Step 3 waits for the link to come up from the retraining triggered by Step
2. As Step 1 is supposed to wait for any ongoing retraining to end, using
DLLLA also for it does not make sense because link training being active is
still indicated using LT bit, not with DLLLA.
Correct the pcie_wait_for_link_status() parameters in Step 1 to only wait
for LT==0 to ensure there is no ongoing Link Training.
This only impacts the Target Speed quirk, which is the only case where
waiting for DLLLA bit is used. It currently works in the problematic case
by means of link training getting initiated by hardware repeatedly and
respecting the new link parameters set by the caller, which then make
training succeed and bring the link up, setting DLLLA and causing
pcie_wait_for_link_status() to return success. We are not supposed to rely
on luck and need to make sure that LT transitioned through the inactive
state though before we initiate link training by hand via RL (Retrain Link)
bit.
Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Link: https://lore.kernel.org/r/20240423130820.43824-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 19326006a21da26532d982254677c892dae8f29b ]
Tegra194 PCIe probe path is taking failure path in success case for
Endpoint mode. Return success from the switch case instead of going
into the failure path.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Link: https://lore.kernel.org/linux-pci/20240408093053.3948634-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 1e560864159d002b453da42bd2c13a1805515a20 upstream.
A last minute revert in 6.7-final introduced a potential deadlock when
enabling ASPM during probe of Qualcomm PCIe controllers as reported by
lockdep:
============================================
WARNING: possible recursive locking detected
6.7.0 #40 Not tainted
--------------------------------------------
kworker/u16:5/90 is trying to acquire lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pcie_aspm_pm_state_change+0x58/0xdc
but task is already holding lock:
ffffacfa78ced000 (pci_bus_sem){++++}-{3:3}, at: pci_walk_bus+0x34/0xbc
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(pci_bus_sem);
lock(pci_bus_sem);
*** DEADLOCK ***
Call trace:
print_deadlock_bug+0x25c/0x348
__lock_acquire+0x10a4/0x2064
lock_acquire+0x1e8/0x318
down_read+0x60/0x184
pcie_aspm_pm_state_change+0x58/0xdc
pci_set_full_power_state+0xa8/0x114
pci_set_power_state+0xc4/0x120
qcom_pcie_enable_aspm+0x1c/0x3c [pcie_qcom]
pci_walk_bus+0x64/0xbc
qcom_pcie_host_post_init_2_7_0+0x28/0x34 [pcie_qcom]
The deadlock can easily be reproduced on machines like the Lenovo ThinkPad
X13s by adding a delay to increase the race window during asynchronous
probe where another thread can take a write lock.
Add a new pci_set_power_state_locked() and associated helper functions that
can be called with the PCI bus semaphore held to avoid taking the read lock
twice.
Link: https://lore.kernel.org/r/ZZu0qx2cmn7IwTyQ@hovoldconsulting.com
Link: https://lore.kernel.org/r/20240130100243.11011-1-johan+linaro@kernel.org
Fixes: f93e71aea6c6 ("Revert "PCI/ASPM: Remove pcie_aspm_pm_state_change()"")
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org> # 6.7
[bhelgaas: backported to v6.6.y, which contains 8cc22ba3f77c ("Revert
"PCI/ASPM: Remove pcie_aspm_pm_state_change()""), a backport of
f93e71aea6c6. This omits the drivers/pci/controller/dwc/pcie-qcom.c hunk
that updates qcom_pcie_enable_aspm(), which was added by 9f4f3dfad8cf
("PCI: qcom: Enable ASPM for platforms supporting 1.9.0 ops"), which is not
present in v6.6.28.]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 9a9eec4765737b9b2a8d6ae03de6480a5f12dd5c ]
Use FIELD_GET() to remove dependencies on the field position, i.e., the
shift value. No functional change intended.
Link: https://lore.kernel.org/r/20231018113254.17616-5-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b5ff74c1ef50fe08e384026875fec660fadfaedd ]
For a physical PCI device that is passed through to a Hyper-V guest VM,
current code specifies the VMBus ring buffer size as 4 pages. But this
is an inappropriate dependency, since the amount of ring buffer space
needed is unrelated to PAGE_SIZE. For example, on x86 the ring buffer
size ends up as 16 Kbytes, while on ARM64 with 64 Kbyte pages, the ring
size bloats to 256 Kbytes. The ring buffer for PCI pass-thru devices
is used for only a few messages during device setup and removal, so any
space above a few Kbytes is wasted.
Fix this by declaring the ring buffer size to be a fixed 16 Kbytes.
Furthermore, use the VMBUS_RING_SIZE() macro so that the ring buffer
header is properly accounted for, and so the size is rounded up to a
page boundary, using the page size for which the kernel is built. While
w/64 Kbyte pages this results in a 64 Kbyte ring buffer header plus a
64 Kbyte ring buffer, that's the smallest possible with that page size.
It's still 128 Kbytes better than the current code.
Link: https://lore.kernel.org/linux-pci/20240216202240.251818-1-mhklinux@outlook.com
Signed-off-by: Michael Kelley <mhklinux@outlook.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Ilpo Jarvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Long Li <longli@microsoft.com>
Cc: <stable@vger.kernel.org> # 5.15.x
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 72e34b8593e08a0ee759b7a038e0b178418ea6f8 ]
The commit message in commit fc9a77040b04 ("PCI: designware-ep: Configure
Resizable BAR cap to advertise the smallest size") claims that it modifies
the Resizable BAR capability to only advertise support for 1 MB size BARs.
However, the commit writes all zeroes to PCI_REBAR_CAP (the register which
contains the possible BAR sizes that a BAR be resized to).
According to the spec, it is illegal to not have a bit set in
PCI_REBAR_CAP, and 1 MB is the smallest size allowed.
Set bit 4 in PCI_REBAR_CAP, so that we actually advertise support for a
1 MB BAR size.
Before:
Capabilities: [2e8 v1] Physical Resizable BAR
BAR 0: current size: 1MB
BAR 1: current size: 1MB
BAR 2: current size: 1MB
BAR 3: current size: 1MB
BAR 4: current size: 1MB
BAR 5: current size: 1MB
After:
Capabilities: [2e8 v1] Physical Resizable BAR
BAR 0: current size: 1MB, supported: 1MB
BAR 1: current size: 1MB, supported: 1MB
BAR 2: current size: 1MB, supported: 1MB
BAR 3: current size: 1MB, supported: 1MB
BAR 4: current size: 1MB, supported: 1MB
BAR 5: current size: 1MB, supported: 1MB
Fixes: fc9a77040b04 ("PCI: designware-ep: Configure Resizable BAR cap to advertise the smallest size")
Link: https://lore.kernel.org/linux-pci/20240307111520.3303774-1-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: <stable@vger.kernel.org> # 5.2
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit bf79e33cdd89db498e00a6131e937259de5f2705 ]
Qcom SoCs making use of ARM SMMU require BDF to SID translation table in
the driver to properly map the SID for the PCIe devices based on their BDF
identifier. This is currently achieved with the help of
qcom_pcie_config_sid_1_9_0() function for SoCs supporting the 1_9_0 config.
But With newer Qcom SoCs starting from SM8450, BDF to SID translation is
set to bypass mode by default in hardware. Due to this, the translation
table that is set in the qcom_pcie_config_sid_1_9_0() is essentially
unused and the default SID is used for all endpoints in SoCs starting from
SM8450.
This is a security concern and also warrants swapping the DeviceID in DT
while using the GIC ITS to handle MSIs from endpoints. The swapping is
currently done like below in DT when using GIC ITS:
/*
* MSIs for BDF (1:0.0) only works with Device ID 0x5980.
* Hence, the IDs are swapped.
*/
msi-map = <0x0 &gic_its 0x5981 0x1>,
<0x100 &gic_its 0x5980 0x1>;
Here, swapping of the DeviceIDs ensure that the endpoint with BDF (1:0.0)
gets the DeviceID 0x5980 which is associated with the default SID as per
the iommu mapping in DT. So MSIs were delivered with IDs swapped so far.
But this also means the Root Port (0:0.0) won't receive any MSIs (for PME,
AER etc...)
So let's fix these issues by clearing the BDF to SID bypass mode for all
SoCs making use of the 1_9_0 config. This allows the PCIe devices to use
the correct SID, thus avoiding the DeviceID swapping hack in DT and also
achieving the isolation between devices.
Fixes: 4c9398822106 ("PCI: qcom: Add support for configuring BDF to SID mapping for SM8250")
Link: https://lore.kernel.org/linux-pci/20240307-pci-bdf-sid-fix-v1-1-9423a7e2d63c@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Cc: stable@vger.kernel.org # 5.11
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 002bf2fbc00e5c4b95fb167287e2ae7d1973281e ]
PM runtime can be done simultaneously with AER error handling. Avoid that
by using pm_runtime_get_sync() before and pm_runtime_put() after reset in
pcie_do_recovery() for all recovering devices.
pm_runtime_get_sync() will increase dev->power.usage_count counter to
prevent any possible future request to runtime suspend a device. It will
also resume a device, if it was previously in D3hot state.
I tested with igc device by doing simultaneous aer_inject and rpm
suspend/resume via /sys/bus/pci/devices/PCI_ID/power/control and can
reproduce:
igc 0000:02:00.0: not ready 65535ms after bus reset; giving up
pcieport 0000:00:1c.2: AER: Root Port link has been reset (-25)
pcieport 0000:00:1c.2: AER: subordinate device reset failed
pcieport 0000:00:1c.2: AER: device recovery failed
igc 0000:02:00.0: Unable to change power state from D3hot to D0, device inaccessible
The problem disappears when this patch is applied.
Link: https://lore.kernel.org/r/20240212120135.146068-1-stanislaw.gruszka@linux.intel.com
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 627c6db20703b5d18d928464f411d0d4ec327508 ]
Commit 5459c0b70467 ("PCI/DPC: Quirk PIO log size for certain Intel Root
Ports") and commit 3b8803494a06 ("PCI/DPC: Quirk PIO log size for Intel Ice
Lake Root Ports") add quirks for Ice, Tiger and Alder Lake Root Ports.
System firmware for Raptor Lake still has the bug, so Linux logs the
warning below on several Raptor Lake systems like Dell Precision 3581 with
Intel Raptor Lake processor (0W18NX) system firmware/BIOS version 1.10.1.
pci 0000:00:07.0: [8086:a76e] type 01 class 0x060400
pci 0000:00:07.0: DPC: RP PIO log size 0 is invalid
pci 0000:00:07.1: [8086:a73f] type 01 class 0x060400
pci 0000:00:07.1: DPC: RP PIO log size 0 is invalid
Apply the quirk for Raptor Lake Root Ports as well.
This also enables the DPC driver to dump the RP PIO Log registers when DPC
is triggered.
Link: https://lore.kernel.org/r/20240305113057.56468-1-pmenzel@molgen.mpg.de
Reported-by: Niels van Aert <nvaert1986@hotmail.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218560
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: Niels van Aert <nvaert1986@hotmail.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9d5286d4e7f68beab450deddbb6a32edd5ecf4bf ]
A race condition between the .runtime_idle() callback and the .remove()
callback in the rtsx_pcr PCI driver leads to a kernel crash due to an
unhandled page fault [1].
The problem is that rtsx_pci_runtime_idle() is not expected to be running
after pm_runtime_get_sync() has been called, but the latter doesn't really
guarantee that. It only guarantees that the suspend and resume callbacks
will not be running when it returns.
However, if a .runtime_idle() callback is already running when
pm_runtime_get_sync() is called, the latter will notice that the runtime PM
status of the device is RPM_ACTIVE and it will return right away without
waiting for the former to complete. In fact, it cannot wait for
.runtime_idle() to complete because it may be called from that callback (it
arguably does not make much sense to do that, but it is not strictly
prohibited).
Thus in general, whoever is providing a .runtime_idle() callback needs
to protect it from running in parallel with whatever code runs after
pm_runtime_get_sync(). [Note that .runtime_idle() will not start after
pm_runtime_get_sync() has returned, but it may continue running then if it
has started earlier.]
One way to address that race condition is to call pm_runtime_barrier()
after pm_runtime_get_sync() (not before it, because a nonzero value of the
runtime PM usage counter is necessary to prevent runtime PM callbacks from
being invoked) to wait for the .runtime_idle() callback to complete should
it be running at that point. A suitable place for doing that is in
pci_device_remove() which calls pm_runtime_get_sync() before removing the
driver, so it may as well call pm_runtime_barrier() subsequently, which
will prevent the race in question from occurring, not just in the rtsx_pcr
driver, but in any PCI drivers providing .runtime_idle() callbacks.
Link: https://lore.kernel.org/lkml/20240229062201.49500-1-kai.heng.feng@canonical.com/ # [1]
Link: https://lore.kernel.org/r/5761426.DvuYhMxLoT@kreacher
Reported-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Ricky Wu <ricky_wu@realtek.com>
Acked-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 039741a8d7c9a01c1bc84a5ac5aa770a5e138a30 ]
The MDIO_WT_DONE() macro tests bit 31, which is always 0 (== done) as
readw_poll_timeout_atomic() does a 16-bit read. Replace with the readl
variant.
[kwilczynski: commit log]
Fixes: ca5dcc76314d ("PCI: brcmstb: Replace status loops with read_poll_timeout_atomic()")
Link: https://lore.kernel.org/linux-pci/20240217133722.14391-1-wahrenst@gmx.net
Signed-off-by: Jonathan Bell <jonathan@raspberrypi.com>
Signed-off-by: Stefan Wahren <wahrenst@gmx.net>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit baf67aefbe7d7deafa59ca49612d163f8889934c ]
Per PCIe r6.1, sec 2.2.6.2 and 7.5.3.4, a Requester may not use 8-bit Tags
unless its Extended Tag Field Enable is set, but all Receivers/Completers
must handle 8-bit Tags correctly regardless of their Extended Tag Field
Enable.
Some devices do not handle 8-bit Tags as Completers, so add a quirk for
them. If we find such a device, we disable Extended Tags for the entire
hierarchy to make peer-to-peer DMA possible.
The 3ware 9650SE seems to have issues with handling 8-bit tags. Mark it as
broken.
This fixes PCI Parity Errors like :
3w-9xxx: scsi0: ERROR: (0x06:0x000C): PCI Parity Error: clearing.
3w-9xxx: scsi0: ERROR: (0x06:0x000D): PCI Abort: clearing.
3w-9xxx: scsi0: ERROR: (0x06:0x000E): Controller Queue Error: clearing.
3w-9xxx: scsi0: ERROR: (0x06:0x0010): Microcontroller Error: clearing.
Link: https://lore.kernel.org/r/20240219132811.8351-1-joerg@wedekind.de
Fixes: 60db3a4d8cc9 ("PCI: Enable PCIe Extended Tags if supported")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=202425
Signed-off-by: Jörg Wedekind <joerg@wedekind.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit aebfdfe39b9327a3077d0df8db3beb3160c9bdd0 ]
If device_register() fails in ntb_register_device(), the device name
allocated by dev_set_name() should be freed. As per the comment in
device_register(), callers should use put_device() to give up the
reference in the error path. So fix this by calling put_device() in the
error path so that the name can be freed in kobject_cleanup().
As a result of this, put_device() in the error path of
ntb_register_device() is removed and the actual error is returned.
Fixes: a1bd3baeb2f1 ("NTB: Add NTB hardware abstraction layer")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20231201033057.1399131-1-yangyingliang@huaweicloud.com
[mani: reworded commit message]
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit dec529b0b0572b32f9eb91c882dd1f08ca657efb ]
The commit in Fixes changed the logic on how resources are released and
introduced a new switchtec_exit_pci() that need to be called explicitly in
order to undo a corresponding switchtec_init_pci().
This was done in the remove function, but not in the probe.
Fix the probe now.
Fixes: df25461119d9 ("PCI: switchtec: Fix stdev_release() crash after surprise hot remove")
Link: https://lore.kernel.org/r/01446d2ccb91a578239915812f2b7dfbeb2882af.1703428183.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1e5c66afd4a40bb7be17cb33cbb1a1085f727730 ]
It is not allowed to sleep within a RCU read section, so use GFP_ATOMIC
instead of GFP_KERNEL here.
Link: https://lore.kernel.org/r/02d9ec4a10235def0e764ff1f5be881ba12e16e8.1704397858.git.christophe.jaillet@wanadoo.fr
Fixes: ae21f835a5bd ("PCI/P2PDMA: Finish RCU conversion of pdev->p2pdma")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 6568d82512b0a64809acff3d7a747362fa4288c8 ]
The TLP Prefix Log Register consists of multiple DWORDs (PCIe r6.1 sec
7.9.14.13) but the loop in dpc_process_rp_pio_error() keeps reading from
the first DWORD, so we print only the first PIO TLP Prefix (duplicated
several times), and we never print the second, third, etc., Prefixes.
Add the iteration count based offset calculation into the config read.
Fixes: f20c4ea49ec4 ("PCI/DPC: Add eDPC support")
Link: https://lore.kernel.org/r/20240118110815.3867-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: add user-visible details to commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 39714fd73c6b60a8d27bcc5b431afb0828bf4434 ]
Make pci_dev_is_disconnected() public so that it can be called from
Intel VT-d driver to quickly fix/workaround the surprise removal
unplug hang issue for those ATS capable devices on PCIe switch downstream
hotplug capable ports.
Beside pci_device_is_present() function, this one has no config space
space access, so is light enough to optimize the normal pure surprise
removal and safe removal flow.
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org>
Tested-by: Haorong Ye <yehaorong@bytedance.com>
Signed-off-by: Ethan Zhao <haifeng.zhao@linux.intel.com>
Link: https://lore.kernel.org/r/20240301080727.3529832-2-haifeng.zhao@linux.intel.com
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Stable-dep-of: 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit db744ddd59be798c2627efbfc71f707f5a935a40 upstream.
While calculating the hardware interrupt number for a MSI interrupt, the
higher bits (i.e. from bit-5 onwards a.k.a domain_nr >= 32) of the PCI
domain number gets truncated because of the shifted value casting to return
type of pci_domain_nr() which is 'int'. This for example is resulting in
same hardware interrupt number for devices 0019:00:00.0 and 0039:00:00.0.
To address this cast the PCI domain number to 'irq_hw_number_t' before left
shifting it to calculate the hardware interrupt number.
Please note that this fixes the issue only on 64-bit systems and doesn't
change the behavior for 32-bit systems i.e. the 32-bit systems continue to
have the issue. Since the issue surfaces only if there are too many PCIe
controllers in the system which usually is the case in modern server
systems and they don't tend to run 32-bit kernels.
Fixes: 3878eaefb89a ("PCI/MSI: Enhance core to support hierarchy irqdomain")
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240115135649.708536-1-vidyas@nvidia.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b5d1b4b46f856da1473c7ba9a5cdfcb55c9b2478 upstream.
The "msg_addr" variable is u64. However, the "aligned_offset" is an
unsigned int. This means that when the code does:
msg_addr &= ~aligned_offset;
it will unintentionally zero out the high 32 bits. Use ALIGN_DOWN() to do
the alignment instead.
Fixes: 2217fffcd63f ("PCI: dwc: endpoint: Fix dw_pcie_ep_raise_msix_irq() alignment support")
Link: https://lore.kernel.org/r/af59c7ad-ab93-40f7-ad4a-7ac0b14d37f5@moroto.mountain
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
[ Upstream commit 41044d5360685e78a869d40a168491a70cdb7e73 ]
The commit noted in fixes added a bogus requirement that runtime PM managed
devices need to be in the RPM_ACTIVE state for PME polling. In fact, only
devices in low power states should be polled.
However there's still a requirement that the device config space must be
accessible, which has implications for both the current state of the polled
device and the parent bridge, when present. It's not sufficient to assume
the bridge remains in D0 and cases have been observed where the bridge
passes the D0 test, but the PM state indicates RPM_SUSPENDING and config
space of the polled device becomes inaccessible during pci_pme_wakeup().
Therefore, since the bridge is already effectively required to be in the
RPM_ACTIVE state, formalize this in the code and elevate the PM usage count
to maintain the state while polling the subordinate device.
This resolves a regression reported in the bugzilla below where a
Thunderbolt/USB4 hierarchy fails to scan for an attached NVMe endpoint
downstream of a bridge in a D3hot power state.
Link: https://lore.kernel.org/r/20240123185548.1040096-1-alex.williamson@redhat.com
Fixes: d3fcd7360338 ("PCI: Fix runtime PM race with PME polling")
Reported-by: Sanath S <sanath.s@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218360
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Sanath S <sanath.s@amd.com>
Reviewed-by: Rafael J. Wysocki <rafael@kernel.org>
Cc: Lukas Wunner <lukas@wunner.de>
Cc: Mika Westerberg <mika.westerberg@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 1291b716bbf969e101d517bfb8ba18d958f758b8 ]
When a device with AER detects an error, it logs error information in its
own AER Error Status registers. It may send an Error Message to the Root
Port (RCEC in the case of an RCiEP), which logs the fact that an Error
Message was received (Root Error Status) and the Requester ID of the
message source (Error Source Identification).
aer_print_port_info() prints the Requester ID from the Root Port Error
Source in the usual Linux "bb:dd.f" format, but when find_source_device()
finds no error details in the hierarchy below the Root Port, it printed the
raw Requester ID without decoding it.
Decode the Requester ID in the usual Linux format so it matches other
messages.
Sample message changes:
- pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
- pcieport 0000:00:1c.5: AER: can't find device of ID00e5
+ pcieport 0000:00:1c.5: AER: Correctable error message received from 0000:00:1c.5
+ pcieport 0000:00:1c.5: AER: found no error details for 0000:00:1c.5
Link: https://lore.kernel.org/r/20231206224231.732765-3-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit df25461119d987b8c81d232cfe4411e91dcabe66 ]
A PCI device hot removal may occur while stdev->cdev is held open. The call
to stdev_release() then happens during close or exit, at a point way past
switchtec_pci_remove(). Otherwise the last ref would vanish with the
trailing put_device(), just before return.
At that later point in time, the devm cleanup has already removed the
stdev->mmio_mrpc mapping. Also, the stdev->pdev reference was not a counted
one. Therefore, in DMA mode, the iowrite32() in stdev_release() will cause
a fatal page fault, and the subsequent dma_free_coherent(), if reached,
would pass a stale &stdev->pdev->dev pointer.
Fix by moving MRPC DMA shutdown into switchtec_pci_remove(), after
stdev_kill(). Counting the stdev->pdev ref is now optional, but may prevent
future accidents.
Reproducible via the script at
https://lore.kernel.org/r/20231113212150.96410-1-dns@arista.com
Link: https://lore.kernel.org/r/20231122042316.91208-2-dns@arista.com
Signed-off-by: Daniel Stodden <dns@arista.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Dmitry Safonov <dima@arista.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit e585a37e5061f6d5060517aed1ca4ccb2e56a34c ]
By running a Van Gogh device (Steam Deck), the following message
was noticed in the kernel log:
pci 0000:04:00.3: PCI class overridden (0x0c03fe -> 0x0c03fe) so dwc3 driver can claim this instead of xhci
Effectively this means the quirk executed but changed nothing, since the
class of this device was already the proper one (likely adjusted by newer
firmware versions).
Check and perform the override only if necessary.
Link: https://lore.kernel.org/r/20231120160531.361552-1-gpiccoli@igalia.com
Signed-off-by: Guilherme G. Piccoli <gpiccoli@igalia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Huang Rui <ray.huang@amd.com>
Cc: Vicki Pfau <vi@endrift.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 3ed48c80b28d8dcd584d6ddaf00c75b7673e1a05 ]
Spectrum-{1,2,3,4} devices report that a D3hot->D0 transition causes a
reset (i.e., they advertise NoSoftRst-). However, this transition does
not have any effect on the device: It continues to be operational and
network ports remain up. Advertising this support makes it seem as if a
PM reset is viable for these devices. Mark it as unavailable to skip it
when testing reset methods.
Before:
# cat /sys/bus/pci/devices/0000\:03\:00.0/reset_method
pm bus
After:
# cat /sys/bus/pci/devices/0000\:03\:00.0/reset_method
bus
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 9ccc1318cf4bd90601f221268e42c3374703d681 ]
When using the fls() helper, the translation table should be a power of
two; otherwise, the resulting value will not be correct.
For example, given fls(0x3e00000) - 1 = 25, the PCIe translation window
size will be set to 0x2000000 instead of the expected size 0x3e00000.
Fix the translation window by splitting the MMIO space into multiple tables
if its size is not a power of two.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20231023081423.18559-1-jianjun.wang@mediatek.com
Fixes: d3bf75b579b9 ("PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192")
Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit c12ca110c613a81cb0f0099019c839d078cd0f38 ]
The PCI driver invokes the PHY APIs using the ks_pcie_enable_phy()
function. The PHY in this case is the Serdes. It is possible that the
PCI instance is configured for two lane operation across two different
Serdes instances, using one lane of each Serdes.
In such a configuration, if the reference clock for one Serdes is
provided by the other Serdes, it results in a race condition. After the
Serdes providing the reference clock is initialized by the PCI driver by
invoking its PHY APIs, it is not guaranteed that this Serdes remains
powered on long enough for the PHY APIs based initialization of the
dependent Serdes. In such cases, the PLL of the dependent Serdes fails
to lock due to the absence of the reference clock from the former Serdes
which has been powered off by the PM Core.
Fix this by obtaining reference to the PHYs before invoking the PHY
initialization APIs and releasing reference after the initialization is
complete.
Link: https://lore.kernel.org/linux-pci/20230927041845.1222080-1-s-vadapalli@ti.com
Fixes: 49229238ab47 ("PCI: keystone: Cleanup PHY handling")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Acked-by: Ravi Gunasekaran <r-gunasekaran@ti.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit 327ec5f70609cf00c6f961073c01857555c6a8eb ]
In the error path of pci_epf_mhi_edma_write() function, the DMA data
direction passed (DMA_FROM_DEVICE) doesn't match the actual direction used
for the data transfer. Fix it by passing the correct one (DMA_TO_DEVICE).
Fixes: 7b99aaaddabb ("PCI: epf-mhi: Add eDMA support")
Reviewed-by: Krzysztof Wilczyński <kw@linux.com>
Link: https://lore.kernel.org/r/20231214063328.40657-1-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
[ Upstream commit b08ded2ef2e98768d5ee5f71da8fe768b1f7774b ]
In the preparation of DMA async support, let's pass the parameters to
read_from_host() and write_to_host() APIs using mhi_ep_buf_info structure.
No functional change.
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Stable-dep-of: 327ec5f70609 ("PCI: epf-mhi: Fix the DMA data direction of dma_unmap_single()")
Signed-off-by: Sasha Levin <sashal@kernel.org>
commit 4e11c29873a8a296a20f99b3e03095e65ebf897d upstream.
We found a failure when using the iperf tool during WiFi performance
testing, where some MSIs were received while clearing the interrupt
status, and these MSIs cannot be serviced.
The interrupt status can be cleared even if the MSI status remains pending.
As such, given the edge-triggered interrupt type, its status should be
cleared before being dispatched to the handler of the underling device.
[kwilczynski: commit log, code comment wording]
Link: https://lore.kernel.org/linux-pci/20231211094923.31967-1-jianjun.wang@mediatek.com
Fixes: 43e6409db64d ("PCI: mediatek: Add MSI support for MT2712 and MT7622")
Signed-off-by: qizhong cheng <qizhong.cheng@mediatek.com>
Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: rewrap comment]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2217fffcd63f86776c985d42e76daa43a56abdf1 upstream.
Commit 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get
correct MSI-X table address") modified dw_pcie_ep_raise_msix_irq() to
support iATUs which require a specific alignment.
However, this support cannot have been properly tested.
The whole point is for the iATU to map an address that is aligned,
using dw_pcie_ep_map_addr(), and then let the writel() write to
ep->msi_mem + aligned_offset.
Thus, modify the address that is mapped such that it is aligned.
With this change, dw_pcie_ep_raise_msix_irq() matches the logic in
dw_pcie_ep_raise_msi_irq().
Link: https://lore.kernel.org/linux-pci/20231128132231.2221614-1-nks@flawful.org
Fixes: 6f5e193bfb55 ("PCI: dwc: Fix dw_pcie_ep_raise_msix_irq() to get correct MSI-X table address")
Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: stable@vger.kernel.org # 5.7
Cc: Kishon Vijay Abraham I <kishon@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f93e71aea6c60ebff8adbd8941e678302d377869 upstream.
This reverts commit 08d0cc5f34265d1a1e3031f319f594bd1970976c.
Michael reported that when attempting to resume from suspend to RAM on ASUS
mini PC PN51-BB757MDE1 (DMI model: MINIPC PN51-E1), 08d0cc5f3426
("PCI/ASPM: Remove pcie_aspm_pm_state_change()") caused a 12-second delay
with no output, followed by a reboot.
Workarounds include:
- Reverting 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
- Booting with "pcie_aspm=off"
- Booting with "pcie_aspm.policy=performance"
- "echo 0 | sudo tee /sys/bus/pci/devices/0000:03:00.0/link/l1_aspm"
before suspending
- Connecting a USB flash drive
Link: https://lore.kernel.org/r/20240102232550.1751655-1-helgaas@kernel.org
Fixes: 08d0cc5f3426 ("PCI/ASPM: Remove pcie_aspm_pm_state_change()")
Reported-by: Michael Schaller <michael@5challer.de>
Link: https://lore.kernel.org/r/76c61361-b8b4-435f-a9f1-32b716763d62@5challer.de
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 49de0dc87965079a8e2803ee4b39f9d946259423 upstream.
The vmd_pm_enable_quirk() helper is called from pci_walk_bus() during
probe to enable ASPM for controllers with VMD_FEAT_BIOS_PM_QUIRK set.
Since pci_walk_bus() already holds a pci_bus_sem read lock, use
pci_enable_link_state_locked() to enable link states in order to avoid a
potential deadlock (e.g. in case someone takes a write lock before
reacquiring the read lock).
Fixes: f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and LTR")
Link: https://lore.kernel.org/r/20231128081512.19387-3-johan+linaro@kernel.org
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
[bhelgaas: add "potential" in subject since the deadlock has only been
reported by lockdep, include helper name in commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: <stable@vger.kernel.org> # 6.3
Cc: Michael Bottini <michael.a.bottini@linux.intel.com>
Cc: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 718ab8226636a1a3a7d281f5d6a7ad7c925efe5a upstream.
Add pci_enable_link_state_locked() for enabling link states that can be
used in contexts where a pci_bus_sem read lock is already held (e.g. from
pci_walk_bus()).
This helper will be used to fix a couple of potential deadlocks where
the current helper is called with the lock already held, hence the CC
stable tag.
Fixes: f492edb40b54 ("PCI: vmd: Add quirk to configure PCIe ASPM and LTR")
Link: https://lore.kernel.org/r/20231128081512.19387-2-johan+linaro@kernel.org
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
[bhelgaas: include helper name in subject, commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: <stable@vger.kernel.org> # 6.3
Cc: Michael Bottini <michael.a.bottini@linux.intel.com>
Cc: David E. Box <david.e.box@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef61a0405742a9f7f6051bc6fd2f017d87d07911 upstream.
This is a partial revert of 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS
increases") for MIPS-based Loongson.
Some MIPS Loongson systems don't support arbitrary Max_Read_Request_Size
(MRRS) settings. 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS
increases") worked around that by (1) assuming that firmware configured
MRRS to the maximum supported value and (2) preventing the PCI core from
increasing MRRS.
Unfortunately, some firmware doesn't set that maximum MRRS correctly, which
results in devices not being initialized correctly. One symptom, from the
Debian report below, is this:
ata4.00: exception Emask 0x0 SAct 0x20000000 SErr 0x0 action 0x6 frozen
ata4.00: failed command: WRITE FPDMA QUEUED
ata4.00: cmd 61/20:e8:00:f0:e1/00:00:00:00:00/40 tag 29 ncq dma 16384 out
res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
ata4.00: status: { DRDY }
ata4: hard resetting link
Limit MRRS to 256 because MIPS Loongson with higher MRRS support is
considered rare.
This must be done at device enablement stage because the MRRS setting may
get lost if PCI_COMMAND_MASTER on the parent bridge is cleared, and we are
only sure parent bridge is enabled at this point.
Fixes: 8b3517f88ff2 ("PCI: loongson: Prevent LS7A MRRS increases")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217680
Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1035587
Link: https://lore.kernel.org/r/20231201115028.84351-1-jiaxun.yang@flygoat.com
Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Huacai Chen <chenhuacai@loongson.cn>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 5df12742b7e3aae2594a30a9d14d5d6e9e7699f4 upstream.
This reverts commit 40613da52b13fb21c5566f10b287e0ca8c12c4e9 and the
subsequent fix to it:
cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
40613da52b13 fixed a problem where hot-adding a device with large BARs
failed if the bridge windows programmed by firmware were not large enough.
cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources()
only for non-root bus") fixed a problem with 40613da52b13: an ACPI hot-add
of a device on a PCI root bus (common in the virt world) or firmware
sending ACPI Bus Check to non-existent Root Ports (e.g., on Dell Inspiron
7352/0W6WV0) caused a NULL pointer dereference and suspend/resume hangs.
Unfortunately the combination of 40613da52b13 and cc22522fd55e caused other
problems:
- Fiona reported that hot-add of SCSI disks in QEMU virtual machine fails
sometimes.
- Dongli reported a similar problem with hot-add of SCSI disks.
- Jonathan reported a console freeze during boot on bare metal due to an
error in radeon GPU initialization.
Revert both patches to avoid adding these problems. This means we will
again see the problems with hot-adding devices with large BARs and the NULL
pointer dereferences and suspend/resume issues that 40613da52b13 and
cc22522fd55e were intended to fix.
Fixes: 40613da52b13 ("PCI: acpiphp: Reassign resources on bridge if necessary")
Fixes: cc22522fd55e ("PCI: acpiphp: Use pci_assign_unassigned_bridge_resources() only for non-root bus")
Reported-by: Fiona Ebner <f.ebner@proxmox.com>
Closes: https://lore.kernel.org/r/9eb669c0-d8f2-431d-a700-6da13053ae54@proxmox.com
Reported-by: Dongli Zhang <dongli.zhang@oracle.com>
Closes: https://lore.kernel.org/r/3c4a446a-b167-11b8-f36f-d3c1b49b42e9@oracle.com
Reported-by: Jonathan Woithe <jwoithe@just42.net>
Closes: https://lore.kernel.org/r/ZXpaNCLiDM+Kv38H@marvin.atrad.com.au
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c9260693aa0c1e029ed23693cfd4d7814eee6624 upstream.
Commit ac91e6980563 ("PCI: Unify delay handling for reset and resume")
shortened an unconditional 1 sec delay after a Secondary Bus Reset to 100
msec for PCIe (per PCIe r6.1 sec 6.6.1). The 1 sec delay is only required
for Conventional PCI.
But it turns out that there are PCIe devices which require a longer delay
than prescribed before first config space access after reset recovery or
resume from D3cold:
Chad reports that a "VideoPropulsion Torrent QN16e" MPEG QAM Modulator
"raises a PCI system error (PERR), as reported by the IPMI event log, and
the hardware itself would suffer a catastrophic event, cycling the server"
unless the longer delay is observed.
The card is specified to conform to PCIe r1.0 and indeed only supports Gen1
speed (2.5 GT/s) according to lspci. PCIe r1.0 sec 7.6 prescribes the same
100 msec delay as PCIe r6.1 sec 6.6.1:
To allow components to perform internal initialization, system software
must wait for at least 100 ms from the end of a reset (cold/warm/hot)
before it is permitted to issue Configuration Requests
The behavior of the Torrent QN16e card thus appears to be a quirk. Treat
it as such and lengthen the reset delay for this specific device.
Fixes: ac91e6980563 ("PCI: Unify delay handling for reset and resume")
Link: https://lore.kernel.org/r/47727e792c7f0282dc144e3ec8ce8eb6e713394e.1695304512.git.lukas@wunner.de
Reported-by: Chad Schroeder <CSchroeder@sonifi.com>
Closes: https://lore.kernel.org/linux-pci/DM6PR16MB2844903E34CAB910082DF019B1FAA@DM6PR16MB2844.namprd16.prod.outlook.com/
Tested-by: Chad Schroeder <CSchroeder@sonifi.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org # v5.4+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 upstream.
With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the
function is discarded from the driver. In this case a bound device can
still get unbound, e.g via sysfs. Then no cleanup code is run resulting in
resource leaks or worse.
The right thing to do is do always have the remove callback available.
This fixes the following warning by modpost:
WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos")
Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alim Akhtar <alim.akhtar@samsung.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3064ef2e88c1629c1e67a77d7bc20020b35846f2 upstream.
With CONFIG_PCIE_KIRIN=y and kirin_pcie_remove() marked with __exit, the
function is discarded from the driver. In this case a bound device can
still get unbound, e.g via sysfs. Then no cleanup code is run resulting in
resource leaks or worse.
The right thing to do is do always have the remove callback available.
This fixes the following warning by modpost:
drivers/pci/controller/dwc/pcie-kirin: section mismatch in reference: kirin_pcie_driver+0x8 (section: .data) -> kirin_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 000f60db784b ("PCI: kirin: Add support for a PHY layer")
Link: https://lore.kernel.org/r/20231001170254.2506508-3-u.kleine-koenig@pengutronix.de
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 8e37372ad0bea4c9b4712d9943f6ae96cff9491f upstream.
aspm_attr_store_common(), which handles sysfs control of ASPM, has the same
problem as fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver
disables L1"): disabling L1 adds only ASPM_L1 (but not any of the L1.x
substates) to the "aspm_disable" mask.
Enabling one substate, e.g., L1.1, via sysfs removes ASPM_L1 from the
disable mask. Since disabling L1 via sysfs doesn't add any of the
substates to the disable mask, enabling L1.1 actually enables *all* the
substates.
In this scenario:
- Write 0 to "l1_aspm" to disable L1
- Write 1 to "l1_1_aspm" to enable L1.1
the intention is to disable L1 and all L1.x substates, then enable just
L1.1, but in fact, *all* L1.x substates are enabled.
Fix this by explicitly disabling all the L1.x substates when disabling L1.
Fixes: 72ea91afbfb0 ("PCI/ASPM: Add sysfs attributes for controlling ASPM link states")
Link: https://lore.kernel.org/r/6ba7dd79-9cfe-4ed0-a002-d99cb842f361@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a07d2497ed657eb2efeb967af47e22f573dcd1d6 upstream.
The DWC core driver exposes the write_dbi2() callback for writing to the
DBI2 registers in a vendor-specific way.
On the Qcom EP platforms, the DBI_CS2 bit in the ELBI region needs to be
asserted before writing to any DBI2 registers and deasserted once done.
So, let's implement the callback for the Qcom PCIe EP driver so that the
DBI2 writes are correctly handled in the hardware.
Without this callback, the DBI2 register writes like BAR size won't go
through and as a result, the default BAR size is set for all BARs.
[kwilczynski: commit log, renamed function to match the DWC convention]
Fixes: f55fee56a631 ("PCI: qcom-ep: Add Qualcomm PCIe Endpoint controller driver")
Suggested-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/linux-pci/20231025130029.74693-2-manivannan.sadhasivam@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Cc: stable@vger.kernel.org # 5.16+
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>