IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.1718707743.git.lukas@wunner.de
Reported-by: Keith Busch <kbusch@kernel.org>
Tested-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Cc: stable@vger.kernel.org # v5.10+
Sequence for controlling the LTSSM state machine is going to change
for SoCs like r8a779f0. Move the LTSSM code to a new callback
ltssm_control() and populate it for each SoCs.
This also warrants the addition of new compatibles for r8a779g0 and
r8a779h0. But since they are already part of the DT binding, it won't
make any difference.
Link: https://lore.kernel.org/linux-pci/20240611125057.1232873-4-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
In order to support future SoCs such as r8a779g0 (R-Car V4H) and
r8a779h0 (R-Car V4M) that require different initialization settings,
introduce SoC specific driver data with the initial member being the
device mode.
No functional change.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240611125057.1232873-3-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
R-Car Gen4 PCIe controller needs to use the Synopsys-specific PCIe
configuration registers. So, add the macros.
Link: https://lore.kernel.org/linux-pci/20240611125057.1232873-2-yoshihiro.shimoda.uh@renesas.com
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Errata #i2037 in AM65x/DRA80xM Processors Silicon Revision 1.0
(SPRZ452D_July 2018_Revised December 2019 [1]) mentions when an
inbound PCIe TLP spans more than two internal AXI 128-byte bursts,
the bus may corrupt the packet payload and the corrupt data may
cause associated applications or the processor to hang.
The workaround for Errata #i2037 is to limit the maximum read
request size and maximum payload size to 128 bytes. Add workaround
for Errata #i2037 here.
The errata and workaround is applicable only to AM65x SR 1.0 and
later versions of the silicon will have this fixed.
[1] -> https://www.ti.com/lit/er/sprz452i/sprz452i.pdf
Link: https://lore.kernel.org/linux-pci/16e1fcae-1ea7-46be-b157-096e05661b15@siemens.com
Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com>
Signed-off-by: Achal Verma <a-verma1@ti.com>
Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
The struct mobiveil_rp_ops is not modified in this driver.
Thus, make this struct constant, which also moves data to a read-only
section decreasing object size and also improving overall security.
On a x86_64, with allmodconfig, as an example:
Before:
======
text data bss dec hex filename
4446 336 32 4814 12ce drivers/pci/controller/mobiveil/pcie-layerscape-gen4.o
After:
=====
text data bss dec hex filename
4454 328 32 4814 12ce drivers/pci/controller/mobiveil/pcie-layerscape-gen4.o
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/189fd881cc8fd80220e74e91820e12cf3a5be114.1719260294.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
KFENCE reports the following UAF:
BUG: KFENCE: use-after-free read in __pci_enable_msi_range+0x2c0/0x488
Use-after-free read at 0x0000000024629571 (in kfence-#12):
__pci_enable_msi_range+0x2c0/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
kfence-#12: 0x0000000008614900-0x00000000e06c228d, size=104, cache=kmalloc-128
allocated by task 81 on cpu 7 at 10.808142s:
__kmem_cache_alloc_node+0x1f0/0x2bc
kmalloc_trace+0x44/0x138
msi_alloc_desc+0x3c/0x9c
msi_domain_insert_msi_desc+0x30/0x78
msi_setup_msi_desc+0x13c/0x184
__pci_enable_msi_range+0x258/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
freed by task 81 on cpu 7 at 10.811436s:
msi_domain_free_descs+0xd4/0x10c
msi_domain_free_locked.part.0+0xc0/0x1d8
msi_domain_alloc_irqs_all_locked+0xb4/0xbc
pci_msi_setup_msi_irqs+0x30/0x4c
__pci_enable_msi_range+0x2a8/0x488
pci_alloc_irq_vectors_affinity+0xec/0x14c
pci_alloc_irq_vectors+0x18/0x28
Descriptor allocation done in:
__pci_enable_msi_range
msi_capability_init
msi_setup_msi_desc
msi_insert_msi_desc
msi_domain_insert_msi_desc
msi_alloc_desc
...
Freed in case of failure in __msi_domain_alloc_locked()
__pci_enable_msi_range
msi_capability_init
pci_msi_setup_msi_irqs
msi_domain_alloc_irqs_all_locked
msi_domain_alloc_locked
__msi_domain_alloc_locked => fails
msi_domain_free_locked
...
That failure propagates back to pci_msi_setup_msi_irqs() in
msi_capability_init() which accesses the descriptor for unmasking in the
error exit path.
Cure it by copying the descriptor and using the copy for the error exit path
unmask operation.
[ tglx: Massaged change log ]
Fixes: bf6e054e0e3f ("genirq/msi: Provide msi_device_populate/destroy_sysfs()")
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Mostafa Saleh <smostafa@google.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Bjorn Heelgas <bhelgaas@google.com>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240624203729.1094506-1-smostafa@google.com
With ARCH=arm64, make allmodconfig && make W=1 C=1 reports:
WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/pci/hotplug/acpiphp_ampere_altra.o
Add the missing MODULE_DESCRIPTION().
Link: https://lore.kernel.org/r/20240612-md-drivers-pci-hotplug-v1-1-2b30d14d783d@quicinc.com
Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
During remove & rescan cycle, PCI subsystem will recalculate and adjust
the bridge window sizing that was initially done by "BIOS". The size
calculation is based on the required alignment of the largest resource
among the downstream resources as per pbus_size_mem() (unimportant or
zero parameters marked with "..."):
min_align = calculate_mem_align(aligns, max_order);
size0 = calculate_memsize(size, ..., min_align);
inside calculate_memsize(), for the largest alignment:
min_align = align1 >> 1;
...
return min_align;
and then in calculate_memsize():
return ALIGN(max(size, ...), align);
If the original bridge window sizing tried to conserve space, this will
lead to massive increase of the required bridge window size when the
downstream has a large disparity in BAR sizes. E.g., with 16MiB and
16GiB BARs this results in 24GiB bridge window size even if 16MiB BAR
does not require gigabytes of space to fit.
When doing remove & rescan for a bus that contains such a PCI device, a
larger bridge window is suddenly required on rescan but when there is a
bridge window upstream that is already assigned based on the original
size, it cannot be enlarged to the new requirement. This causes the
allocation of the bridge window to fail (0x600000000 > 0x400ffffff):
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:01:00.0: PCI bridge to [bus 02-04]
pci 0000:01:00.0: bridge window [mem 0x40400000-0x406fffff]
pci 0000:01:00.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:03:00.0: device released
pci 0000:02:01.0: device released
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 0
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 0
pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref]
pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref]
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1
pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: can't assign; no space
pci 0000:02:01.0: bridge window [mem size 0x600000000 64bit pref]: failed to assign
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned
pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: can't assign; no space
pci 0000:03:00.0: BAR 2 [mem size 0x400000000 64bit pref]: failed to assign
pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: can't assign; no space
pci 0000:03:00.0: BAR 0 [mem size 0x01000000 64bit pref]: failed to assign
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
This is a major surprise for users who are suddenly left with a device that
was working fine with the original bridge window sizing.
Even if the already assigned bridge window could be enlarged by
reallocation in some cases (something the current code does not attempt
to do), it is not possible in general case and the large amount of
wasted space at the tail of the bridge window may lead to other
resource exhaustion problems on Root Complex level (think of multiple
PCIe cards with VFs and BAR size disparity in a single system).
PCI BARs only need natural alignment (PCIe r6.1, sec 7.5.1.2.1) and bridge
memory windows need 1MiB (sec 7.5.1.3). The current bridge window tail
alignment rule was introduced in the commit 5d0a8965aea9 ("[PATCH] 2.5.14:
New PCI allocation code (alpha, arm, parisc) [2/2]") that only states:
"pbus_size_mem: core stuff; tested with randomly generated sets of
resources". It does not explain the motivation for the extra tail space
allocated that is not truly needed by the downstream resources. As such, it
is far from clear if it ever has been required by any HW.
To prevent devices with BAR size disparity from becoming unusable after
remove & rescan cycle, attempt to do a truly minimal allocation for memory
resources if needed. First check if the normally calculated bridge window
will not fit into an already assigned upstream resource. In such case, try
with relaxed bridge window tail sizing rules instead where no extra tail
space is requested beyond what the downstream resources require. Only
enforce the alignment requirement of the bridge window itself (normally
1MiB).
With this patch, the resources are successfully allocated:
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: scanning [bus 03-03] behind bridge, pass 1
pcieport 0000:01:00.0: scanning [bus 02-04] behind bridge, pass 1
pcieport 0000:01:00.0: Assigned bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 02-04] cannot fit 0x600000000 required for 0000:02:01.0 bridging to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref] to [bus 03] requires relaxed alignment rules
pcieport 0000:01:00.0: Assigned bridge window [mem 0x40400000-0x406fffff] to [bus 02-04] free space at [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]: assigned
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]: assigned
pci 0000:03:00.0: BAR 2 [mem 0x6000000000-0x63ffffffff 64bit pref]: assigned
pci 0000:03:00.0: BAR 0 [mem 0x6400000000-0x6400ffffff 64bit pref]: assigned
pci 0000:03:00.0: ROM [mem 0x40400000-0x405fffff pref]: assigned
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0: bridge window [mem 0x40400000-0x405fffff]
pci 0000:02:01.0: bridge window [mem 0x6000000000-0x6400ffffff 64bit pref]
This patch draws inspiration from the initial investigations and work by
Mika Westerberg.
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=216795
Link: https://lore.kernel.org/linux-pci/20190812144144.2646-1-mika.westerberg@linux.intel.com/
Fixes: 5d0a8965aea9 ("[PATCH] 2.5.14: New PCI allocation code (alpha, arm, parisc) [2/2]")
Link: https://lore.kernel.org/r/20240507102523.57320-9-ilpo.jarvinen@linux.intel.com
Tested-by: Lidong Wang <lidong.wang@intel.com>
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>
Add a PCI power control driver that's capable of correctly powering up
devices using the power sequencing subsystem. The first users of this
driver are the ath11k module on QCA6390 and ath12k on WCN7850. These
packages require a certain delay between enabling the Bluetooth and WLAN
modules and the power sequencing subsystem takes care of it behind the
scenes.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-6-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Some PCI devices must be powered-on before they can be detected on the
bus. Introduce a simple framework reusing the existing PCI OF
infrastructure.
The way this works is: a DT node representing a PCI device connected to
the port can be matched against its power control platform driver. If
the match succeeds, the driver is responsible for powering-up the device
and calling pci_pwrctl_device_set_ready() which will trigger a PCI bus
rescan as well as subscribe to PCI bus notifications.
When the device is detected and created, we'll make it consume the same
DT node that the platform device did. When the device is bound, we'll
create a device link between it and the parent power control device.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-5-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
In preparation for introducing PCI device power control - a set of
library functions that will allow powering-up of PCI devices before
they're detected on the PCI bus - we need to populate the devices
defined on the device-tree.
We are reusing the platform bus as it provides us with all the
infrastructure we need to match the pwrctl drivers against the
compatibles from OF nodes.
These platform devices will be probed by the driver core and bound to
the PCI pwrctl drivers we'll introduce later.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-4-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
With PCI power control we deal with two struct device objects bound to
two different drivers but consuming the same OF node. We must not bind
the pinctrl twice. To that end: before setting the OF node of the newly
instantiated PCI device, check if a platform device consuming the same
OF node doesn't already exist on the platform bus and - if so - mark the
PCI device as reusing the OF node.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-3-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
With the introduction of PCI device power control drivers that will be
able to trigger the port rescan when probing, we need to hold the rescan
mutex during the initial pci_host_probe() too or the two could get in
each other's way.
Tested-by: Amit Pundir <amit.pundir@linaro.org>
Tested-by: Neil Armstrong <neil.armstrong@linaro.org> # on SM8550-QRD, SM8650-QRD & SM8650-HDK
Tested-by: Caleb Connolly <caleb.connolly@linaro.org> # OnePlus 8T
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Link: https://lore.kernel.org/r/20240612082019.19161-2-brgl@bgdev.pl
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
Now that the driver core allows for struct class to be in read-only memory,
we should make all 'class' structures declared at build time placing them
into read-only memory, instead of having to be dynamically allocated at
runtime.
Link: https://lore.kernel.org/r/2024061053-online-unwound-b173@gregkh
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-By: Logan Gunthorpe <logang@deltatee.com>
Cc: Kurt Schwemmer <kurt.schwemmer@microsemi.com>
Cc: Jon Mason <jdmason@kudzu.us>
Cc: Allen Hubbe <allenbh@gmail.com>
While 'x' and '&x[0]' are equivalent, most of the PCI drivers use the
former form for the .id_table.
Update some drivers and documentation for consistency.
Link: https://lore.kernel.org/r/20240517120458.1260489-1-masahiroy@kernel.org
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
'tegra_pcie_soc' has been unused since 56e15a238d92 ("PCI: tegra: Add
Tegra194 PCIe support"). Remove it.
Link: https://lore.kernel.org/r/20240527160118.37069-1-linux@treblig.org
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
While the experiment did reveal that there are additional places that are
missing the lock during secondary bus reset, one of the places that needs
to take cfg_access_lock (pci_bus_lock()) is not prepared for lockdep
annotation.
Specifically, pci_bus_lock() takes pci_dev_lock() recursively and is
currently dependent on the fact that the device_lock() is marked
lockdep_set_novalidate_class(&dev->mutex). Otherwise, without that
annotation, pci_bus_lock() would need to use something like a new
pci_dev_lock_nested() helper, a scheme to track a PCI device's depth in the
topology, and a hope that the depth of a PCI tree never exceeds the max
value for a lockdep subclass.
The alternative to ripping out the lockdep coverage would be to deploy a
dynamic lock key for every PCI device. Unfortunately, there is evidence
that increasing the number of keys that lockdep needs to track to be
per-PCI-device is prohibitively expensive for something like the
cfg_access_lock.
The main motivation for adding the annotation in the first place was to
catch unlocked secondary bus resets, not necessarily catch lock ordering
problems between cfg_access_lock and other locks. Solve that narrower
problem with follow-on patches, and just due to targeted revert for now.
Link: https://lore.kernel.org/r/171711746402.1628941.14575335981264103013.stgit@dwillia2-xfh.jf.intel.com
Fixes: 7e89efc6e9e4 ("PCI: Lock upstream bridge for pci_reset_function()")
Reported-by: Imre Deak <imre.deak@intel.com>
Closes: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_134186v1/shard-dg2-1/igt@device_reset@unbind-reset-rebind.html
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Hans de Goede <hdegoede@redhat.com>
Tested-by: Kalle Valo <kvalo@kernel.org>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Cc: Jani Saarinen <jani.saarinen@intel.com>
Use preserve_config in place of checking for PCI_PROBE_ONLY flag to enable
support for "linux,pci-probe-only" on a per host bridge basis.
This also obviates the use of adding PCI_REASSIGN_ALL_BUS flag if
!PCI_PROBE_ONLY, as pci_assign_unassigned_root_bus_resources() takes care
of reassigning the resources that are not already claimed.
Link: https://lore.kernel.org/r/20240508174138.3630283-5-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add of_pci_preserve_config() to look for the "linux,pci-probe-only"
property under a specified node. If it's not found there, look under
"of_chosen" in addition.
If the caller didn't specify a node, look under "of_chosen".
With a future patch, this will support "linux,pci-probe-only" on a per host
bridge basis based on the presence of the property in the respective PCI
host bridge DT node.
Implement of_pci_check_probe_only() using of_pci_preserve_config().
Link: https://lore.kernel.org/r/20240508174138.3630283-3-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Move the PRESERVE_BOOT_CONFIG _DSM evaluation from acpi_pci_root_create()
to pci_register_host_bridge().
This will help unify the ACPI _DSM path and the DT-based
"linux,pci-probe-only" paths.
This should be safe because it happens earlier than it used to:
acpi_pci_root_create
pci_create_root_bus
pci_register_host_bridge
+ bridge->preserve_config = pci_preserve_config(bridge)
pci_acpi_preserve_config
+ acpi_evaluate_dsm_typed(DSM_PCI_PRESERVE_BOOT_CONFIG)
- acpi_evaluate_dsm_typed(DSM_PCI_PRESERVE_BOOT_CONFIG)
No functional change intended.
Link: https://lore.kernel.org/r/20240508174138.3630283-2-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Ricky reports that replacing a device in a hotplug slot during ACPI sleep
state S3 does not cause re-enumeration on resume, as one would expect.
Instead, the new device is treated as if it was the old one.
There is no bulletproof way to detect device replacement, but as a
heuristic, check whether the device identity in config space matches cached
data in struct pci_dev (Vendor ID, Device ID, Class Code, Revision ID,
Subsystem Vendor ID, Subsystem ID). Additionally, cache and compare the
Device Serial Number (PCIe r6.2 sec 7.9.3). If a mismatch is detected,
mark the old device disconnected (to prevent its driver from accessing the
new device) and synthesize a Presence Detect Changed event.
The device identity in config space which is compared here is the same as
the one included in the signed Subject Alternative Name per PCIe r6.1 sec
6.31.3. Thus, the present commit prevents attacks where a valid device is
replaced with a malicious device during system sleep and the valid device's
driver obliviously accesses the malicious device.
This is about as much as can be done at the PCI layer. Drivers may have
additional ways to identify devices (such as reading a WWID from some
register) and may trigger re-enumeration when detecting an identity change
on resume.
Link: https://lore.kernel.org/r/a1afaa12f341d146ecbea27c1743661c71683833.1716992815.git.lukas@wunner.de
Reported-by: Ricky Wu <ricky_wu@realtek.com>
Closes: https://lore.kernel.org/r/a608b5930d0a48f092f717c0e137454b@realtek.com
Tested-by: Ricky Wu <ricky_wu@realtek.com>
Signed-off-by: Lukas Wunner <lukas@wunner.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Per PCIe r6.0, sec 5.2, a Link Down event can happen under any of the
following circumstances:
1. Fundamental/Hot reset
2. Link disable transmission by upstream component
3. Moving from L2/L3 to L0
When the event happens, the EPC driver capable of detecting it may pass the
notification to the EPF driver through link_down() callback in 'struct
pci_epc_event_ops'.
While the PCIe spec has not defined the actual behavior of the endpoint
when the Link Down event happens, we may assume that at least the ongoing
transactions need to be stopped as the link won't be active, so
cancel the command handler work in the callback implementation
pci_epf_test_link_down(). The work will be started again in
pci_epf_test_link_up() once the link comes back again.
Link: https://lore.kernel.org/linux-pci/20240430-pci-epf-rework-v4-10-22832d0d456f@linaro.org
Tested-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: update spec citation]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Tegra194 and Tegra234 PCIe EP controllers have 64K alignment restriction
for the inbound ATU. Set the endpoint inbound ATU alignment to 64kB in the
Tegra194 PCIe driver.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Suggested-by: Manikanta Maddireddy <mmaddireddy@nvidia.com>
Link: https://lore.kernel.org/linux-pci/20240508092207.337063-1-jonathanh@nvidia.com
Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Avoid large backtrace, it is sufficient to warn the user that there has
been a link problem. Either the link has failed and the system is in need
of maintenance, or the link continues to work and user has been informed.
The message from the warning can be looked up in the sources.
This makes an actual link issue less verbose.
First of all, this controller has a limitation in that the controller
driver has to assist the hardware with transition to L1 link state by
writing L1IATN to PMCTRL register, the L1 and L0 link state switching
is not fully automatic on this controller.
In case of an ASMedia ASM1062 PCIe SATA controller which does not support
ASPM, on entry to suspend or during platform pm_test, the SATA controller
enters D3hot state and the link enters L1 state. If the SATA controller
wakes up before rcar_pcie_wakeup() was called and returns to D0, the link
returns to L0 before the controller driver even started its transition to
L1 link state. At this point, the SATA controller did send an PM_ENTER_L1
DLLP to the PCIe controller and the PCIe controller received it, and the
PCIe controller did set PMSR PMEL1RX bit.
Once rcar_pcie_wakeup() is called, if the link is already back in L0 state
and PMEL1RX bit is set, the controller driver has no way to determine if
it should perform the link transition to L1 state, or treat the link as if
it is in L0 state. Currently the driver attempts to perform the transition
to L1 link state unconditionally, which in this specific case fails with a
PMSR L1FAEG poll timeout, however the link still works as it is already
back in L0 state.
Reduce this warning verbosity. In case the link is really broken, the
rcar_pcie_config_access() would fail, otherwise it will succeed and any
system with this controller and ASM1062 can suspend without generating
a backtrace.
Fixes: 84b576146294 ("PCI: rcar: Finish transition to L1 state in rcar_pcie_config_access()")
Link: https://lore.kernel.org/linux-pci/20240511235513.77301-1-marek.vasut+renesas@mailbox.org
Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Add the PCIE_RESET_CONFIG_DEVICE_WAIT_MS macro to define the minimum
waiting time between exit from a conventional reset and sending the
first configuration request to the device.
As described in PCIe r6.0, sec 6.6.1 <Conventional Reset>, there are two
different use cases of the value:
- "With a Downstream Port that does not support Link speeds greater
than 5.0 GT/s, software must wait a minimum of 100 ms following exit
from a Conventional Reset before sending a Configuration Request to
the device immediately below that Port."
- "With a Downstream Port that supports Link speeds greater than
5.0 GT/s, software must wait a minimum of 100 ms after Link training
completes before sending a Configuration Request to the device
immediately below that Port."
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-21-minda.chen@starfivetech.com
Signed-off-by: Kevin Xie <kevin.xie@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mason Huo <mason.huo@starfivetech.com>
plda_pcie_setup_iomems() needs the bridge->windows list from struct
pci_host_bridge and is currently used only by pcie-microchip-host.c. This
driver uses pci_host_common_probe(), which sets a pci_host_bridge as the
drvdata, so plda_pcie_setup_iomems() used platform_get_drvdata() to find
the pci_host_bridge.
But we also want to use plda_pcie_setup_iomems() in the new pcie-starfive.c
driver, which does not use pci_host_common_probe() and will have struct
starfive_jh7110_pcie as its drvdata, so pass the pci_host_bridge directly
to plda_pcie_setup_iomems() so it doesn't need platform_get_drvdata() to
find it.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-9-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
[bhelgaas: commit log, reorder to where this is needed]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Add PLDA host plda_pcie_host_init()/plda_pcie_host_deinit() and map bus
function so vendors can use it to init PLDA PCIe host core.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-19-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Mason Huo <mason.huo@starfivetech.com>
PLDA DMA interrupts are not all implemented, and the non-implemented
interrupts should be masked. Add a bitmap field to mask the non-implemented
interrupts.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-18-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
As PLDA DT binding doc (Documentation/devicetree/bindings/pci/
plda,xpressrich3-axi-common.yaml) showed, PLDA PCIe contains an interrupt
controller.
PolarFire implements its own PCIe interrupts, additional to the regular
PCIe interrupts, due to lack of an MSI controller, so the interrupt to
event number mapping is different to the PLDA regular interrupts,
necessitating a custom get_events() implementation.
Microchip PolarFire PCIe additional interrupts (defined in
drivers/pci/controller/plda/pcie-microchip-host.c):
EVENT_PCIE_L2_EXIT
EVENT_PCIE_HOTRST_EXIT
EVENT_PCIE_DLUP_EXIT
EVENT_SEC_TX_RAM_SEC_ERR
EVENT_SEC_RX_RAM_SEC_ERR
...
plda_get_events() adds interrupt register to PLDA event num mapping codes.
All the PLDA interrupts can be seen in new added graph.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-15-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
The INTx and MSI interrupt event num is different across platforms, so
add two event num fields in struct plda_event.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-14-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
As the PLDA DT binding doc (Documentation/devicetree/bindings/pci/
plda,xpressrich3-axi-common.yaml) shows, the PLDA IP contains an interrupt
controller. Microchip PolarFire add some interrupts based on PLDA interrupt
controller.
The Microchip PolarFire PCIe additional interrupts (defined in
drivers/pci/controller/plda/pcie-microchip-host.c):
EVENT_PCIE_L2_EXIT
EVENT_PCIE_HOTRST_EXIT
EVENT_PCIE_DLUP_EXIT
EVENT_SEC_TX_RAM_SEC_ERR
EVENT_SEC_RX_RAM_SEC_ERR
...
Both event_cause[] and mc_event_handler() contain additional interrupt
symbol names; these can not be re-used. Add a new plda_event_handler()
function, which implements PLDA interrupt defalt handler, and add a
request_event_irq() callback function for Microchip PolarFire additional
interrupts.
[kwilczynski, bhelgaas: commit log]
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-13-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
The number of events is different across platforms. In order to share
interrupt processing code, add a variable that defines the number of
events so that it can be set per-platform instead of hardcoding it.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-12-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Rename mc_* to plda_* for IRQ functions and related IRQ domain ops data
instances.
MSI, INTx interrupt code and IRQ init code can all be re-used.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-11-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Acked-by: Conor Dooley <conor.dooley@microchip.com>
Move plda_pcie_setup_window() and plda_pcie_setup_iomems() to
pcie-plda-host.c so they can be shared by all PLDA-based drivers.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-10-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Rename mc_pcie_setup_window() to plda_pcie_setup_window() and
mc_pcie_setup_windows() to plda_pcie_setup_iomems() so they can be shared
by all PLDA-based drivers.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-8-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Move the PLDA generic data structures to a header file so they can be
re-used by all PLDA-based drivers.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-7-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Rename struct mc_msi to plda_msi and move most of struct mc_pcie to a new
struct plda_pcie_rp so they can be shared by all PLDA-based drivers.
The axi_base_addr field remains in struct mc_pcie since it's
Microchip-specific data.
The event interrupt code is still using struct mc_pcie because the event
interrupt code can not be re-used.
[kwilczynski: commit log]
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-6-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Bridge address base is common PLDA field, add this to struct mc_pcie first.
INTx and MSI interrupt code will be changed to common code, so get the
bridge base address from port->bridge_addr instead of axi_base_addr.
The axi_base_addr is Microchip-specific data.
Link: https://lore.kernel.org/linux-pci/20240328091835.14797-5-minda.chen@starfivetech.com
Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>