linux

iv/linux

Author	SHA1	Message	Date
Rafael J. Wysocki	736be35671	PCI: Use pci_update_current_state() in pci_enable_device_flags() [ Upstream commit 14858dcc3b3587f4bb5c48e130ee7d68fc2b0a29 ] Updating the current_state field of struct pci_dev the way it is done in pci_enable_device_flags() before calling do_pci_enable_device() may not work. For example, if the given PCI device depends on an ACPI power resource whose _STA method initially returns 0 ("off"), but the config space of the PCI device is accessible and the power state retrieved from the PCI_PM_CTRL register is D0, the current_state field in the struct pci_dev representing that device will get out of sync with the power.state of its ACPI companion object and that will lead to power management issues going forward. To avoid such issues, make pci_enable_device_flags() call pci_update_current_state() which takes ACPI device power management into account, if present, to retrieve the current power state of the device. Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/ Reported-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Tested-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-22 11:43:04 +02:00
Krzysztof Wilczyński	69d00a7db3	PCI: Return ~0 data on pciconfig_read() CAP_SYS_ADMIN failure commit a8bd29bd49c4156ea0ec5a97812333e2aeef44e7 upstream. The pciconfig_read() syscall reads PCI configuration space using hardware-dependent config accessors. If the read fails on PCI, most accessors don't return an error; they pretend the read was successful and got ~0 data from the device, so the syscall returns success with ~0 data in the buffer. When the accessor does return an error, pciconfig_read() normally fills the user's buffer with ~0 and returns an error in errno. But after e4585da22ad0 ("pci syscall.c: Switch to refcounting API"), we don't fill the buffer with ~0 for the EPERM "user lacks CAP_SYS_ADMIN" error. Userspace may rely on the ~0 data to detect errors, but after e4585da22ad0, that would not detect CAP_SYS_ADMIN errors. Restore the original behaviour of filling the buffer with ~0 when the CAP_SYS_ADMIN check fails. [bhelgaas: commit log, fold in Nathan's fix https://lore.kernel.org/r/20210803200836.500658-1-nathan@kernel.org] Fixes: e4585da22ad0 ("pci syscall.c: Switch to refcounting API") Link: https://lore.kernel.org/r/20210729233755.1509616-1-kw@linux.com Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 11:43:04 +02:00
Marek Behún	6fb6e7b3dc	PCI: Restrict ASMedia ASM1062 SATA Max Payload Size Supported commit b12d93e9958e028856cbcb061b6e64728ca07755 upstream. The ASMedia ASM1062 SATA controller advertises Max_Payload_Size_Supported of 512, but in fact it cannot handle incoming TLPs with payload size of 512. We discovered this issue on PCIe controllers capable of MPS = 512 (Aardvark and DesignWare), where the issue presents itself as an External Abort. Bjorn Helgaas says: Probably ASM1062 reports a Malformed TLP error when it receives a data payload of 512 bytes, and Aardvark, DesignWare, etc convert this to an arm64 External Abort. [1] To avoid this problem, limit the ASM1062 Max Payload Size Supported to 256 bytes, so we set the Max Payload Size of devices that may send TLPs to the ASM1062 to 256 or less. [1] https://lore.kernel.org/linux-pci/20210601170907.GA1949035@bjorn-Precision-5520/ BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=212695 Link: https://lore.kernel.org/r/20210624171418.27194-2-kabel@kernel.org Reported-by: Rötti <espressobinboardarmbiantempmailaddress@posteo.de> Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Krzysztof Wilczyński <kw@linux.com> Reviewed-by: Pali Rohár <pali@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 11:43:03 +02:00
Marek Marczykowski-Górecki	04bc1b5368	PCI/MSI: Skip masking MSI-X on Xen PV commit 1a519dc7a73c977547d8b5108d98c6e769c89f4b upstream. When running as Xen PV guest, masking MSI-X is a responsibility of the hypervisor. The guest has no write access to the relevant BAR at all - when it tries to, it results in a crash like this: BUG: unable to handle page fault for address: ffffc9004069100c #PF: supervisor write access in kernel mode #PF: error_code(0x0003) - permissions violation RIP: e030:__pci_enable_msix_range.part.0+0x26b/0x5f0 e1000e_set_interrupt_capability+0xbf/0xd0 [e1000e] e1000_probe+0x41f/0xdb0 [e1000e] local_pci_probe+0x42/0x80 (...) The recently introduced function msix_mask_all() does not check the global variable pci_msi_ignore_mask which is set by XEN PV to bypass the masking of MSI[-X] interrupts. Add the check to make this function XEN PV compatible. Fixes: 7d5ec3d36123 ("PCI/MSI: Mask all unused MSI-X entries") Signed-off-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210826170342.135172-1-marmarek@invisiblethingslab.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 11:43:03 +02:00
Rafael J. Wysocki	47f32ededb	PCI: PM: Enable PME if it can be signaled from D3cold [ Upstream commit 0e00392a895c95c6d12d42158236c8862a2f43f2 ] PME signaling is only enabled by __pci_enable_wake() if the target device can signal PME from the given target power state (to avoid pointless reconfiguration of the device), but if the hierarchy above the device goes into D3cold, the device itself will end up in D3cold too, so if it can signal PME from D3cold, it should be enabled to do so in __pci_enable_wake(). [Note that if the device does not end up in D3cold and it cannot signal PME from the original target power state, it will not signal PME, so in that case the behavior does not change.] Link: https://lore.kernel.org/linux-pm/3149540.aeNJFYEL58@kreacher/ Fixes: 5bcc2fb4e815 ("PCI PM: Simplify PCI wake-up code") Reported-by: Mika Westerberg <mika.westerberg@linux.intel.com> Reported-by: Utkarsh H Patel <utkarsh.h.patel@intel.com> Reported-by: Koba Ko <koba.ko@canonical.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Tested-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-09-22 11:43:00 +02:00
Marek Behún	ad87dbb60f	PCI: Call Max Payload Size-related fixup quirks early commit b8da302e2955fe4d41eb9d48199242674d77dbe0 upstream. pci_device_add() calls HEADER fixups after pci_configure_device(), which configures Max Payload Size. Convert MPS-related fixups to EARLY fixups so pci_configure_mps() takes them into account. Fixes: 27d868b5e6cfa ("PCI: Set MPS to match upstream bridge") Link: https://lore.kernel.org/r/20210624171418.27194-1-kabel@kernel.org Signed-off-by: Marek Behún <kabel@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-09-22 11:42:58 +02:00
Thomas Gleixner	97987e146e	PCI/MSI: Enforce MSI[X] entry updates to be visible commit b9255a7cb51754e8d2645b65dd31805e282b4f3e upstream. Nothing enforces the posted writes to be visible when the function returns. Flush them even if the flush might be redundant when the entry is masked already as the unmask will flush as well. This is either setup or a rare affinity change event so the extra flush is not the end of the world. While this is more a theoretical issue especially the logic in the X86 specific msi_set_affinity() function relies on the assumption that the update has reached the hardware when the function returns. Again, as this never has been enforced the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.515188147@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:24 -04:00
Thomas Gleixner	6a91749e39	PCI/MSI: Enforce that MSI-X table entry is masked for update commit da181dc974ad667579baece33c2c8d2d1e4558d5 upstream. The specification (PCIe r5.0, sec 6.1.4.5) states: For MSI-X, a function is permitted to cache Address and Data values from unmasked MSI-X Table entries. However, anytime software unmasks a currently masked MSI-X Table entry either by clearing its Mask bit or by clearing the Function Mask bit, the function must update any Address or Data values that it cached from that entry. If software changes the Address or Data value of an entry while the entry is unmasked, the result is undefined. The Linux kernel's MSI-X support never enforced that the entry is masked before the entry is modified hence the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Enforce the entry to be masked across the update. There is no point in enforcing this to be handled at all possible call sites as this is just pointless code duplication and the common update function is the obvious place to enforce this. Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Reported-by: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.462096385@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:24 -04:00
Thomas Gleixner	3590d16b47	PCI/MSI: Mask all unused MSI-X entries commit 7d5ec3d3612396dc6d4b76366d20ab9fc06f399f upstream. When MSI-X is enabled the ordering of calls is: msix_map_region(); msix_setup_entries(); pci_msi_setup_msi_irqs(); msix_program_entries(); This has a few interesting issues: 1) msix_setup_entries() allocates the MSI descriptors and initializes them except for the msi_desc:masked member which is left zero initialized. 2) pci_msi_setup_msi_irqs() allocates the interrupt descriptors and sets up the MSI interrupts which ends up in pci_write_msi_msg() unless the interrupt chip provides its own irq_write_msi_msg() function. 3) msix_program_entries() does not do what the name suggests. It solely updates the entries array (if not NULL) and initializes the masked member for each MSI descriptor by reading the hardware state and then masks the entry. Obviously this has some issues: 1) The uninitialized masked member of msi_desc prevents the enforcement of masking the entry in pci_write_msi_msg() depending on the cached masked bit. Aside of that half initialized data is a NONO in general 2) msix_program_entries() only ensures that the actually allocated entries are masked. This is wrong as experimentation with crash testing and crash kernel kexec has shown. This limited testing unearthed that when the production kernel had more entries in use and unmasked when it crashed and the crash kernel allocated a smaller amount of entries, then a full scan of all entries found unmasked entries which were in use in the production kernel. This is obviously a device or emulation issue as the device reset should mask all MSI-X table entries, but obviously that's just part of the paper specification. Cure this by: 1) Masking all table entries in hardware 2) Initializing msi_desc::masked in msix_setup_entries() 3) Removing the mask dance in msix_program_entries() 4) Renaming msix_program_entries() to msix_update_entries() to reflect the purpose of that function. As the masking of unused entries has never been done the Fixes tag refers to a commit in: git://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git Fixes: f036d4ea5fa7 ("[PATCH] ia32 Message Signalled Interrupt support") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.403833459@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:24 -04:00
Thomas Gleixner	cb93b6c52b	PCI/MSI: Protect msi_desc::masked for multi-MSI commit 77e89afc25f30abd56e76a809ee2884d7c1b63ce upstream. Multi-MSI uses a single MSI descriptor and there is a single mask register when the device supports per vector masking. To avoid reading back the mask register the value is cached in the MSI descriptor and updates are done by clearing and setting bits in the cache and writing it to the device. But nothing protects msi_desc::masked and the mask register from being modified concurrently on two different CPUs for two different Linux interrupts which belong to the same multi-MSI descriptor. Add a lock to struct device and protect any operation on the mask and the mask register with it. This makes the update of msi_desc::masked unconditional, but there is no place which requires a modification of the hardware register without updating the masked cache. msi_mask_irq() is now an empty wrapper which will be cleaned up in follow up changes. The problem goes way back to the initial support of multi-MSI, but picking the commit which introduced the mask cache is a valid cut off point (2.6.30). Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.726833414@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:24 -04:00
Thomas Gleixner	9c2266da9a	PCI/MSI: Use msi_mask_irq() in pci_msi_shutdown() commit d28d4ad2a1aef27458b3383725bb179beb8d015c upstream. No point in using the raw write function from shutdown. Preparatory change to introduce proper serialization for the msi_desc::masked cache. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.674391354@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:24 -04:00
Thomas Gleixner	73a00b002f	PCI/MSI: Correct misleading comments commit 689e6b5351573c38ccf92a0dd8b3e2c2241e4aff upstream. The comments about preserving the cached state in pci_msi[x]_shutdown() are misleading as the MSI descriptors are freed right after those functions return. So there is nothing to restore. Preparatory change. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.621609423@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:23 -04:00
Thomas Gleixner	4e1ec390ff	PCI/MSI: Do not set invalid bits in MSI mask commit 361fd37397f77578735907341579397d5bed0a2d upstream. msi_mask_irq() takes a mask and a flags argument. The mask argument is used to mask out bits from the cached mask and the flags argument to set bits. Some places invoke it with a flags argument which sets bits which are not used by the device, i.e. when the device supports up to 8 vectors a full unmask in some places sets the mask to 0xFFFFFF00. While devices probably do not care, it's still bad practice. Fixes: 7ba1930db02f ("PCI MSI: Unmask MSI if setup failed") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Marc Zyngier <maz@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.568173099@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:23 -04:00
Thomas Gleixner	130b04aff3	PCI/MSI: Enable and mask MSI-X early commit 438553958ba19296663c6d6583d208dfb6792830 upstream. The ordering of MSI-X enable in hardware is dysfunctional: 1) MSI-X is disabled in the control register 2) Various setup functions 3) pci_msi_setup_msi_irqs() is invoked which ends up accessing the MSI-X table entries 4) MSI-X is enabled and masked in the control register with the comment that enabling is required for some hardware to access the MSI-X table Step #4 obviously contradicts #3. The history of this is an issue with the NIU hardware. When #4 was introduced the table access actually happened in msix_program_entries() which was invoked after enabling and masking MSI-X. This was changed in commit d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") which removed the table write from msix_program_entries(). Interestingly enough nobody noticed and either NIU still works or it did not get any testing with a kernel 3.19 or later. Nevertheless this is inconsistent and there is no reason why MSI-X can't be enabled and masked in the control register early on, i.e. move step #4 above to step #1. This preserves the NIU workaround and has no side effects on other hardware. Fixes: d71d6432e105 ("PCI/MSI: Kill redundant call of irq_set_msi_desc() for MSI-X interrupts") Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Marc Zyngier <maz@kernel.org> Reviewed-by: Ashok Raj <ashok.raj@intel.com> Reviewed-by: Marc Zyngier <maz@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210729222542.344136412@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-08-26 08:37:23 -04:00
Krzysztof Wilczyński	fb25a515f0	PCI/sysfs: Fix dsm_label_utf16s_to_utf8s() buffer overrun [ Upstream commit bdcdaa13ad96f1a530711c29e6d4b8311eff767c ] "utf16s_to_utf8s(..., buf, PAGE_SIZE)" puts up to PAGE_SIZE bytes into "buf" and returns the number of bytes it actually put there. If it wrote PAGE_SIZE bytes, the newline added by dsm_label_utf16s_to_utf8s() would overrun "buf". Reduce the size available for utf16s_to_utf8s() to use so there is always space for the newline. [bhelgaas: reorder patch in series, commit log] Fixes: 6058989bad05 ("PCI: Export ACPI _DSM provided firmware instance number and string name to sysfs") Link: https://lore.kernel.org/r/20210603000112.703037-7-kw@linux.com Reported-by: Joe Perches <joe@perches.com> Signed-off-by: Krzysztof Wilczyński <kw@linux.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-07-20 16:21:14 +02:00
Rafael J. Wysocki	97f709333b	Revert "PCI: PM: Do not read power state in pci_enable_device_flags()" [ Upstream commit 4d6035f9bf4ea12776322746a216e856dfe46698 ] Revert commit 4514d991d992 ("PCI: PM: Do not read power state in pci_enable_device_flags()") that is reported to cause PCI device initialization issues on some systems. BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=213481 Link: https://lore.kernel.org/linux-acpi/YNDoGICcg0V8HhpQ@eldamar.lan Reported-by: Michael <phyre@rogers.com> Reported-by: Salvatore Bonaccorso <carnil@debian.org> Fixes: 4514d991d992 ("PCI: PM: Do not read power state in pci_enable_device_flags()") Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-06-30 08:49:19 -04:00
Shanker Donthineni	2f0969c0be	PCI: Mark some NVIDIA GPUs to avoid bus reset commit 4c207e7121fa92b66bf1896bf8ccb9edfb0f9731 upstream. Some NVIDIA GPU devices do not work with SBR. Triggering SBR leaves the device inoperable for the current system boot. It requires a system hard-reboot to get the GPU device back to normal operating condition post-SBR. For the affected devices, enable NO_BUS_RESET quirk to avoid the issue. This issue will be fixed in the next generation of hardware. Link: https://lore.kernel.org/r/20210608054857.18963-8-ameynarkhede03@gmail.com Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Sinan Kaya <okaya@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-30 08:49:17 -04:00
Antti Järvinen	d71d336213	PCI: Mark TI C667X to avoid bus reset commit b5cf198e74a91073d12839a3e2db99994a39995d upstream. Some TI KeyStone C667X devices do not support bus/hot reset. The PCIESS automatically disables LTSSM when Secondary Bus Reset is received and device stops working. Prevent bus reset for these devices. With this change, the device can be assigned to VMs with VFIO, but it will leak state between VMs. Reference: https://e2e.ti.com/support/processors/f/791/t/954382 Link: https://lore.kernel.org/r/20210315102606.17153-1-antti.jarvinen@gmail.com Signed-off-by: Antti Järvinen <antti.jarvinen@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Kishon Vijay Abraham I <kishon@ti.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-06-30 08:49:17 -04:00
Feilong Lin	f701705774	ACPI / hotplug / PCI: Fix reference count leak in enable_slot() [ Upstream commit 3bbfd319034ddce59e023837a4aa11439460509b ] In enable_slot(), if pci_get_slot() returns NULL, we clear the SLOT_ENABLED flag. When pci_get_slot() finds a device, it increments the device's reference count. In this case, we did not call pci_dev_put() to decrement the reference count, so the memory of the device (struct pci_dev type) will eventually leak. Call pci_dev_put() to decrement its reference count when pci_get_slot() returns a PCI device. Link: https://lore.kernel.org/r/b411af88-5049-a1c6-83ac-d104a1f429be@huawei.com Signed-off-by: Feilong Lin <linfeilong@huawei.com> Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-22 10:40:34 +02:00
Dmitry Baryshkov	c864ded837	PCI: Release OF node in pci_scan_device()'s error path [ Upstream commit c99e755a4a4c165cad6effb39faffd0f3377c02d ] In pci_scan_device(), if pci_setup_device() fails for any reason, the code will not release device's of_node by calling pci_release_of_node(). Fix that by calling the release function. Fixes: 98d9f30c820d ("pci/of: Match PCI devices to OF nodes dynamically") Link: https://lore.kernel.org/r/20210124232826.1879-1-dmitry.baryshkov@linaro.org Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-22 10:40:30 +02:00
Rafael J. Wysocki	80d37f2483	PCI: PM: Do not read power state in pci_enable_device_flags() [ Upstream commit 4514d991d99211f225d83b7e640285f29f0755d0 ] It should not be necessary to update the current_state field of struct pci_dev in pci_enable_device_flags() before calling do_pci_enable_device() for the device, because none of the code between that point and the pci_set_power_state() call in do_pci_enable_device() invoked later depends on it. Moreover, doing that is actively harmful in some cases. For example, if the given PCI device depends on an ACPI power resource whose _STA method initially returns 0 ("off"), but the config space of the PCI device is accessible and the power state retrieved from the PCI_PM_CTRL register is D0, the current_state field in the struct pci_dev representing that device will get out of sync with the power.state of its ACPI companion object and that will lead to power management issues going forward. To avoid such issues it is better to leave the current_state value as is until it is changed to PCI_D0 by do_pci_enable_device() as appropriate. However, the power state of the device is not changed to PCI_D0 if it is already enabled when pci_enable_device_flags() gets called for it, so update its current_state in that case, but use pci_update_current_state() covering platform PM too for that. Link: https://lore.kernel.org/lkml/20210314000439.3138941-1-luzmaximilian@gmail.com/ Reported-by: Maximilian Luz <luzmaximilian@gmail.com> Tested-by: Maximilian Luz <luzmaximilian@gmail.com> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-05-22 10:40:16 +02:00
Tyrel Datwyler	ef8dc3d327	PCI: rpadlpar: Fix potential drc_name corruption in store functions commit cc7a0bb058b85ea03db87169c60c7cfdd5d34678 upstream. Both add_slot_store() and remove_slot_store() try to fix up the drc_name copied from the store buffer by placing a NUL terminator at nbyte + 1 or in place of a '\n' if present. However, the static buffer that we copy the drc_name data into is not zeroed and can contain anything past the n-th byte. This is problematic if a '\n' byte appears in that buffer after nbytes and the string copied into the store buffer was not NUL terminated to start with as the strchr() search for a '\n' byte will mark this incorrectly as the end of the drc_name string resulting in a drc_name string that contains garbage data after the n-th byte. Additionally it will cause us to overwrite that '\n' byte on the stack with NUL, potentially corrupting data on the stack. The following debugging shows an example of the drmgr utility writing "PHB 4543" to the add_slot sysfs attribute, but add_slot_store() logging a corrupted string value. drmgr: drmgr: -c phb -a -s PHB 4543 -d 1 add_slot_store: drc_name = PHB 4543°\|<82>!, rc = -19 Fix this by using strscpy() instead of memcpy() to ensure the string is NUL terminated when copied into the static drc_name buffer. Further, since the string is now NUL terminated the code only needs to change '\n' to '\0' when present. Cc: stable@vger.kernel.org Signed-off-by: Tyrel Datwyler <tyreld@linux.ibm.com> [mpe: Reformat change log and add mention of possible stack corruption] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20210315214821.452959-1-tyreld@linux.ibm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2021-03-24 10:59:24 +01:00
Martin Kaiser	a3641e1cb8	PCI: xgene-msi: Fix race in installing chained irq handler [ Upstream commit a93c00e5f975f23592895b7e83f35de2d36b7633 ] Fix a race where a pending interrupt could be received and the handler called before the handler's data has been setup, by converting to irq_set_chained_handler_and_data(). See also 2cf5a03cb29d ("PCI/keystone: Fix race in installing chained IRQ handler"). Based on the mail discussion, it seems ok to drop the error handling. Link: https://lore.kernel.org/r/20210115212435.19940-3-martin@kaiser.cx Signed-off-by: Martin Kaiser <martin@kaiser.cx> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-17 16:10:14 +01:00
Bjorn Helgaas	60fa1eaeb1	PCI: Add function 1 DMA alias quirk for Marvell 9215 SATA controller [ Upstream commit 059983790a4c963d92943e55a61fca55be427d55 ] Add function 1 DMA alias quirk for Marvell 88SS9215 PCIe SSD Controller. Link: https://bugzilla.kernel.org/show_bug.cgi?id=42679#c135 Link: https://lore.kernel.org/r/20201110220516.697934-1-helgaas@kernel.org Reported-by: John Smith <LK7S2ED64JHGLKj75shg9klejHWG49h5hk@protonmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-11 13:48:03 +01:00
Heiner Kallweit	5a442c5d07	PCI: Align checking of syscall user config accessors [ Upstream commit ef9e4005cbaf022c6251263aa27836acccaef65d ] After 34e3207205ef ("PCI: handle positive error codes"), pci_user_read_config_() and pci_user_write_config_() return 0 or negative errno values, not PCIBIOS_* values like PCIBIOS_SUCCESSFUL or PCIBIOS_BAD_REGISTER_NUMBER. Remove comparisons with PCIBIOS_SUCCESSFUL and check only for non-zero. It happens that PCIBIOS_SUCCESSFUL is zero, so this is not a functional change, but it aligns this code with the user accessors. [bhelgaas: commit log] Fixes: 34e3207205ef ("PCI: handle positive error codes") Link: https://lore.kernel.org/r/f1220314-e518-1e18-bf94-8e6f8c703758@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2021-03-03 17:44:39 +01:00
Jubin Zhong	0519bd382a	PCI: Fix pci_slot_release() NULL pointer dereference commit 4684709bf81a2d98152ed6b610e3d5c403f9bced upstream. If kobject_init_and_add() fails, pci_slot_release() is called to delete slot->list from parent->slots. But slot->list hasn't been initialized yet, so we dereference a NULL pointer: Unable to handle kernel NULL pointer dereference at virtual address 00000000 ... CPU: 10 PID: 1 Comm: swapper/0 Not tainted 4.4.240 #197 task: ffffeb398a45ef10 task.stack: ffffeb398a470000 PC is at __list_del_entry_valid+0x5c/0xb0 LR is at pci_slot_release+0x84/0xe4 ... __list_del_entry_valid+0x5c/0xb0 pci_slot_release+0x84/0xe4 kobject_put+0x184/0x1c4 pci_create_slot+0x17c/0x1b4 __pci_hp_initialize+0x68/0xa4 pciehp_probe+0x1a4/0x2fc pcie_port_probe_service+0x58/0x84 driver_probe_device+0x320/0x470 Initialize slot->list before calling kobject_init_and_add() to avoid this. Fixes: 8a94644b440e ("PCI: Fix pci_create_slot() reference count leak") Link: https://lore.kernel.org/r/1606876422-117457-1-git-send-email-zhongjubin@huawei.com Signed-off-by: Jubin Zhong <zhongjubin@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-12-29 13:45:07 +01:00
Rajat Jain	84a7342347	PCI: Add device even if driver attach failed commit 2194bc7c39610be7cabe7456c5f63a570604f015 upstream. device_attach() returning failure indicates a driver error while trying to probe the device. In such a scenario, the PCI device should still be added in the system and be visible to the user. When device_attach() fails, merely warn about it and keep the PCI device in the system. This partially reverts ab1a187bba5c ("PCI: Check device_attach() return value always"). Link: https://lore.kernel.org/r/20200706233240.3245512-1-rajatja@google.com Signed-off-by: Rajat Jain <rajatja@google.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: stable@vger.kernel.org # v4.6+ [sudip: use dev_warn] Signed-off-by: Sudip Mukherjee <sudipm.mukherjee@gmail.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-12-02 08:31:25 +01:00
Qiushi Wu	e840d01d94	PCI: Fix pci_create_slot() reference count leak [ Upstream commit 8a94644b440eef5a7b9c104ac8aa7a7f413e35e5 ] kobject_init_and_add() takes a reference even when it fails. If it returns an error, kobject_put() must be called to clean up the memory associated with the object. When kobject_init_and_add() fails, call kobject_put() instead of kfree(). b8eb718348b8 ("net-sysfs: Fix reference count leak in rx\|netdev_queue_add_kobject") fixed a similar problem. Link: https://lore.kernel.org/r/20200528021322.1984-1-wu000273@umn.edu Signed-off-by: Qiushi Wu <wu000273@umn.edu> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-09-03 11:21:17 +02:00
Rafael J. Wysocki	b6a8f34956	PCI: hotplug: ACPI: Fix context refcounting in acpiphp_grab_context() commit dae68d7fd4930315389117e9da35b763f12238f9 upstream. If context is not NULL in acpiphp_grab_context(), but the is_going_away flag is set for the device's parent, the reference counter of the context needs to be decremented before returning NULL or the context will never be freed, so make that happen. Fixes: edf5bf34d408 ("ACPI / dock: Use callback pointers from devices' ACPI hotplug contexts") Reported-by: Vasily Averin <vvs@virtuozzo.com> Cc: 3.15+ <stable@vger.kernel.org> # 3.15+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-08-21 11:02:07 +02:00
Xiongfeng Wang	345037fe42	PCI/ASPM: Add missing newline in sysfs 'policy' [ Upstream commit 3167e3d340c092fd47924bc4d23117a3074ef9a9 ] When I cat ASPM parameter 'policy' by sysfs, it displays as follows. Add a newline for easy reading. Other sysfs attributes already include a newline. [root@localhost ~]# cat /sys/module/pcie_aspm/parameters/policy [default] performance powersave powersupersave [root@localhost ~]# Fixes: 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") Link: https://lore.kernel.org/r/1594972765-10404-1-git-send-email-wangxiongfeng2@huawei.com Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-08-21 11:02:02 +02:00
Bjorn Helgaas	f5808a905d	PCI: Fix pci_cfg_wait queue locking problem [ Upstream commit 2a7e32d0547f41c5ce244f84cf5d6ca7fccee7eb ] The pci_cfg_wait queue is used to prevent user-space config accesses to devices while they are recovering from reset. Previously we used these operations on pci_cfg_wait: __add_wait_queue(&pci_cfg_wait, ...) __remove_wait_queue(&pci_cfg_wait, ...) wake_up_all(&pci_cfg_wait) The wake_up acquires the wait queue lock, but the add and remove do not. Originally these were all protected by the pci_lock, but cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock"), moved wake_up_all() outside pci_lock, so it could race with add/remove operations, which caused occasional kernel panics, e.g., during vfio-pci hotplug/unplug testing: Unable to handle kernel read from unreadable memory at virtual address ffff802dac469000 Resolve this by using wait_event() instead of __add_wait_queue() and __remove_wait_queue(). The wait queue lock is held by both wait_event() and wake_up_all(), so it provides mutual exclusion. Fixes: cdcb33f98244 ("PCI: Avoid possible deadlock on pci_lock and p->pi_lock") Link: https://lore.kernel.org/linux-pci/79827f2f-9b43-4411-1376-b9063b67aee3@huawei.com/T/#u Based-on: https://lore.kernel.org/linux-pci/20191210031527.40136-1-zhengxiang9@huawei.com/ Based-on-patch-by: Xiang Zheng <zhengxiang9@huawei.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Xiang Zheng <zhengxiang9@huawei.com> Cc: Heyi Guo <guoheyi@huawei.com> Cc: Biaoxiang Ye <yebiaoxiang@huawei.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-08-21 11:02:01 +02:00
Robert Hancock	2d9dacf235	PCI/ASPM: Disable ASPM on ASMedia ASM1083/1085 PCIe-to-PCI bridge commit b361663c5a40c8bc758b7f7f2239f7a192180e7c upstream. Recently ASPM handling was changed to allow ASPM on PCIe-to-PCI/PCI-X bridges. Unfortunately the ASMedia ASM1083/1085 PCIe to PCI bridge device doesn't seem to function properly with ASPM enabled. On an Asus PRIME H270-PRO motherboard, it causes errors like these: pcieport 0000:00:1c.0: AER: PCIe Bus Error: severity=Corrected, type=Data Link Layer, (Transmitter ID) pcieport 0000:00:1c.0: AER: device [8086:a292] error status/mask=00003000/00002000 pcieport 0000:00:1c.0: AER: [12] Timeout pcieport 0000:00:1c.0: AER: Corrected error received: 0000:00:1c.0 pcieport 0000:00:1c.0: AER: can't find device of ID00e0 In addition to flooding the kernel log, this also causes the machine to wake up immediately after suspend is initiated. The device advertises ASPM L0s and L1 support in the Link Capabilities register, but the ASMedia web page for ASM1083 [1] claims "No PCIe ASPM support". Windows 10 (build 2004) enables L0s, but it also logs correctable PCIe errors. Add a quirk to disable ASPM for this device. [1] https://www.asmedia.com.tw/eng/e_show_products.php?cate_index=169&item=114 [bhelgaas: commit log] Fixes: 66ff14e59e8a ("PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges") Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=208667 Link: https://lore.kernel.org/r/20200722021803.17958-1-hancockrwd@gmail.com Signed-off-by: Robert Hancock <hancockrwd@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-08-21 11:01:48 +02:00
Bjorn Helgaas	3facf1adb9	PCI/PTM: Inherit Switch Downstream Port PTM settings from Upstream Port [ Upstream commit 7b38fd9760f51cc83d80eed2cfbde8b5ead9e93a ] Except for Endpoints, we enable PTM at enumeration-time. Previously we did not account for the fact that Switch Downstream Ports are not permitted to have a PTM capability; their PTM behavior is controlled by the Upstream Port (PCIe r5.0, sec 7.9.16). Since Downstream Ports don't have a PTM capability, we did not mark them as "ptm_enabled", which meant that pci_enable_ptm() on an Endpoint failed because there was no PTM path to it. Mark Downstream Ports as "ptm_enabled" if their Upstream Port has PTM enabled. Fixes: eec097d43100 ("PCI: Add pci_enable_ptm() for drivers to enable PTM on endpoints") Reported-by: Aditya Paluri <Venkata.AdityaPaluri@synopsys.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:38:26 -04:00
Kai-Heng Feng	e0596b2c96	PCI/ASPM: Allow ASPM on links to PCIe-to-PCI/PCI-X Bridges [ Upstream commit 66ff14e59e8a30690755b08bc3042359703fb07a ] 7d715a6c1ae5 ("PCI: add PCI Express ASPM support") added the ability for Linux to enable ASPM, but for some undocumented reason, it didn't enable ASPM on links where the downstream component is a PCIe-to-PCI/PCI-X Bridge. Remove this exclusion so we can enable ASPM on these links. The Dell OptiPlex 7080 mentioned in the bugzilla has a TI XIO2001 PCIe-to-PCI Bridge. Enabling ASPM on the link leading to it allows the Intel SoC to enter deeper Package C-states, which is a significant power savings. [bhelgaas: commit log] Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=207571 Link: https://lore.kernel.org/r/20200505173423.26968-1-kai.heng.feng@canonical.com Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:38:24 -04:00
Andrew Murray	f84bd31331	PCI: rcar: Fix incorrect programming of OB windows [ Upstream commit 2b9f217433e31d125fb697ca7974d3de3ecc3e92 ] The outbound windows (PCIEPAUR(x), PCIEPALR(x)) describe a mapping between a CPU address (which is determined by the window number 'x') and a programmed PCI address - Thus allowing the controller to translate CPU accesses into PCI accesses. However the existing code incorrectly writes the CPU address - lets fix this by writing the PCI address instead. For memory transactions, existing DT users describe a 1:1 identity mapping and thus this change should have no effect. However the same isn't true for I/O. Link: https://lore.kernel.org/r/20191004132941.6660-1-andrew.murray@arm.com Fixes: c25da4778803 ("PCI: rcar: Add Renesas R-Car PCIe driver") Tested-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Andrew Murray <andrew.murray@arm.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Marek Vasut <marek.vasut+renesas@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:38:24 -04:00
Pali Rohár	9a91d5256e	PCI: aardvark: Don't blindly enable ASPM L0s and don't write to read-only register [ Upstream commit 90c6cb4a355e7befcb557d217d1d8b8bd5875a05 ] Trying to change Link Status register does not have any effect as this is a read-only register. Trying to overwrite bits for Negotiated Link Width does not make sense. In future proper change of link width can be done via Lane Count Select bits in PCIe Control 0 register. Trying to unconditionally enable ASPM L0s via ASPM Control bits in Link Control register is wrong. There should be at least some detection if endpoint supports L0s as isn't mandatory. Moreover ASPM Control bits in Link Control register are controlled by pcie/aspm.c code which sets it according to system ASPM settings, immediately after aardvark driver probes. So setting these bits by aardvark driver has no long running effect. Remove code which touches ASPM L0s bits from this driver and let kernel's ASPM implementation to set ASPM state properly. Some users are reporting issues that this code is problematic for some Intel wifi cards and removing it fixes them, see e.g.: https://bugzilla.kernel.org/show_bug.cgi?id=196339 If problems with Intel wifi cards occur even after this commit, then pcie/aspm.c code could be modified / hooked to not enable ASPM L0s state for affected problematic cards. Link: https://lore.kernel.org/r/20200430080625.26070-3-pali@kernel.org Tested-by: Tomasz Maciej Nowak <tmn505@gmail.com> Signed-off-by: Pali Rohár <pali@kernel.org> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Rob Herring <robh@kernel.org> Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-30 15:38:21 -04:00
Ashok Raj	08727720bb	PCI: Program MPS for RCiEP devices commit aa0ce96d72dd2e1b0dfd0fb868f82876e7790878 upstream. Root Complex Integrated Endpoints (RCiEPs) do not have an upstream bridge, so pci_configure_mps() previously ignored them, which may result in reduced performance. Instead, program the Max_Payload_Size of RCiEPs to the maximum supported value (unless it is limited for the PCIE_BUS_PEER2PEER case). This also affects the subsequent programming of Max_Read_Request_Size because Linux programs MRRS based on the MPS value. Fixes: 9dae3a97297f ("PCI: Move MPS configuration check to pci_configure_device()") Link: https://lore.kernel.org/r/1585343775-4019-1-git-send-email-ashok.raj@intel.com Tested-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Ashok Raj <ashok.raj@intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-06-20 10:24:20 +02:00
Jiaxun Yang	97aad0ad75	PCI: Don't disable decoding when mmio_always_on is set [ Upstream commit b6caa1d8c80cb71b6162cb1f1ec13aa655026c9f ] Don't disable MEM/IO decoding when a device have both non_compliant_bars and mmio_always_on. That would allow us quirk devices with junk in BARs but can't disable their decoding. Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com> Acked-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-06-20 10:24:18 +02:00
Heiner Kallweit	947a17f242	PCI/ASPM: Allow re-enabling Clock PM [ Upstream commit 35efea32b26f9aacc99bf07e0d2cdfba2028b099 ] Previously Clock PM could not be re-enabled after being disabled by pci_disable_link_state() because clkpm_capable was reset. Change this by adding a clkpm_disable field similar to aspm_disable. Link: https://lore.kernel.org/r/4e8a66db-7d53-4a66-c26c-f0037ffaa705@gmail.com Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-05-02 17:23:07 +02:00
Navid Emamdoost	4c2ddfa6a5	PCI/IOV: Fix memory leak in pci_iov_add_virtfn() [ Upstream commit 8c386cc817878588195dde38e919aa6ba9409d58 ] In the implementation of pci_iov_add_virtfn() the allocated virtfn is leaked if pci_setup_device() fails. The error handling is not calling pci_stop_and_remove_bus_device(). Change the goto label to failed2. Fixes: 156c55325d30 ("PCI: Check for pci_setup_device() failure in pci_iov_add_virtfn()") Link: https://lore.kernel.org/r/20191125195255.23740-1-navid.emamdoost@gmail.com Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-28 15:42:23 +01:00
Logan Gunthorpe	208ac81366	PCI: Don't disable bridge BARs when assigning bus resources commit 9db8dc6d0785225c42a37be7b44d1b07b31b8957 upstream. Some PCI bridges implement BARs in addition to bridge windows. For example, here's a PLX switch: 04:00.0 PCI bridge: PLX Technology, Inc. PEX 8724 24-Lane, 6-Port PCI Express Gen 3 (8 GT/s) Switch, 19 x 19mm FCBGA (rev ca) (prog-if 00 [Normal decode]) Flags: bus master, fast devsel, latency 0, IRQ 30, NUMA node 0 Memory at 90a00000 (32-bit, non-prefetchable) [size=256K] Bus: primary=04, secondary=05, subordinate=0a, sec-latency=0 I/O behind bridge: 00002000-00003fff Memory behind bridge: 90000000-909fffff Prefetchable memory behind bridge: 0000380000800000-0000380000bfffff Previously, when the kernel assigned resource addresses (with the pci=realloc command line parameter, for example) it could clear the struct resource corresponding to the BAR. When this happened, lspci would report this BAR as "ignored": Region 0: Memory at <ignored> (32-bit, non-prefetchable) [size=256K] This is because the kernel reports a zero start address and zero flags in the corresponding sysfs resource file and in /proc/bus/pci/devices. Investigation with 'lspci -x', however, shows the BIOS-assigned address will still be programmed in the device's BAR registers. It's clearly a bug that the kernel lost track of the BAR value, but in most cases, this still won't result in a visible issue because nothing uses the memory, so nothing is affected. However, when an IOMMU is in use, it will not reserve this space in the IOVA because the kernel no longer thinks the range is valid. (See dmar_init_reserved_ranges() for the Intel implementation of this.) Without the proper reserved range, a DMA mapping may allocate an IOVA that matches a bridge BAR, which results in DMA accesses going to the BAR instead of the intended RAM. The problem was in pci_assign_unassigned_root_bus_resources(). When any resource from a bridge device fails to get assigned, the code set the resource's flags to zero. This makes sense for bridge windows, as they will be re-enabled later, but for regular BARs, it makes the kernel permanently lose track of the fact that they decode address space. Change pci_assign_unassigned_root_bus_resources() and pci_assign_unassigned_bridge_resources() so they only clear "res->flags" for bridge windows, not bridge BARs. Fixes: da7822e5ad71 ("PCI: update bridge resources to get more big ranges when allocating space (again)") Link: https://lore.kernel.org/r/20200108213208.4612-1-logang@deltatee.com [bhelgaas: commit log, check for pci_is_bridge()] Reported-by: Kit Chow <kchow@gigaio.com> Signed-off-by: Logan Gunthorpe <logang@deltatee.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-02-14 16:31:09 -05:00
Yurii Monakov	1eeb10c067	PCI: keystone: Fix link training retries initiation [ Upstream commit 6df19872d881641e6394f93ef2938cffcbdae5bb ] ks_pcie_stop_link() function does not clear LTSSM_EN_VAL bit so link training was not triggered more than once after startup. In configurations where link can be unstable during early boot, for example, under low temperature, it will never be established. Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Signed-off-by: Yurii Monakov <monakov.y@gmail.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Andrew Murray <andrew.murray@arm.com> Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>	2020-02-14 16:31:00 -05:00
Bjorn Helgaas	c21e3d1e1c	PCI/PTM: Remove spurious "d" from granularity message commit 127a7709495db52a41012deaebbb7afc231dad91 upstream. The granularity message has an extra "d": pci 0000:02:00.0: PTM enabled, 4dns granularity Remove the "d" so the message is simply "PTM enabled, 4ns granularity". Fixes: 8b2ec318eece ("PCI: Add PTM clock granularity information") Link: https://lore.kernel.org/r/20191106222420.10216-2-helgaas@kernel.org Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Cc: Jonathan Yong <jonathan.yong@intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2020-01-23 08:19:36 +01:00
Jian-Hong Pan	e3745fabfe	PCI/MSI: Fix incorrect MSI-X masking on resume commit e045fa29e89383c717e308609edd19d2fd29e1be upstream. When a driver enables MSI-X, msix_program_entries() reads the MSI-X Vector Control register for each vector and saves it in desc->masked. Each register is 32 bits and bit 0 is the actual Mask bit. When we restored these registers during resume, we previously set the Mask bit if any bit in desc->masked was set instead of when the Mask bit itself was set: pci_restore_state pci_restore_msi_state __pci_restore_msix_state for_each_pci_msi_entry msix_mask_irq(entry, entry->masked) <-- entire u32 word __pci_msix_desc_mask_irq(desc, flag) mask_bits = desc->masked & ~PCI_MSIX_ENTRY_CTRL_MASKBIT if (flag) <-- testing entire u32, not just bit 0 mask_bits \|= PCI_MSIX_ENTRY_CTRL_MASKBIT writel(mask_bits, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL) This means that after resume, MSI-X vectors were masked when they shouldn't be, which leads to timeouts like this: nvme nvme0: I/O 978 QID 3 timeout, completion polled On resume, set the Mask bit only when the saved Mask bit from suspend was set. This should remove the need for 19ea025e1d28 ("nvme: Add quirk for Kingston NVME SSD running FW E8FK11.T"). [bhelgaas: commit log, move fix to __pci_msix_desc_mask_irq()] Link: https://bugzilla.kernel.org/show_bug.cgi?id=204887 Link: https://lore.kernel.org/r/20191008034238.2503-1-jian-hong@endlessm.com Fixes: f2440d9acbe8 ("PCI MSI: Refactor interrupt masking code") Signed-off-by: Jian-Hong Pan <jian-hong@endlessm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 10:42:29 +01:00
Steffen Liebergeld	0d78b18ea4	PCI: Fix Intel ACS quirk UPDCR register address commit d8558ac8c93d429d65d7490b512a3a67e559d0d4 upstream. According to documentation [0] the correct offset for the Upstream Peer Decode Configuration Register (UPDCR) is 0x1014. It was previously defined as 0x1114. d99321b63b1f ("PCI: Enable quirks for PCIe ACS on Intel PCH root ports") intended to enforce isolation between PCI devices allowing them to be put into separate IOMMU groups. Due to the wrong register offset the intended isolation was not fully enforced. This is fixed with this patch. Please note that I did not test this patch because I have no hardware that implements this register. [0] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/4th-gen-core-family-mobile-i-o-datasheet.pdf (page 325) Fixes: d99321b63b1f ("PCI: Enable quirks for PCIe ACS on Intel PCH root ports") Link: https://lore.kernel.org/r/7a3505df-79ba-8a28-464c-88b83eefffa6@kernkonzept.com Signed-off-by: Steffen Liebergeld <steffen.liebergeld@kernkonzept.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Andrew Murray <andrew.murray@arm.com> Acked-by: Ashok Raj <ashok.raj@intel.com> Cc: stable@vger.kernel.org # v3.15+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-12-21 10:42:28 +01:00
Kishon Vijay Abraham I	61022359b3	PCI: keystone: Use quirk to limit MRRS for K2G [ Upstream commit 148e340c0696369fadbbddc8f4bef801ed247d71 ] PCI controller in K2G also has a limitation that memory read request size (MRRS) must not exceed 256 bytes. Use the quirk to limit MRRS (added for K2HK, K2L and K2E) for K2G as well. Signed-off-by: Kishon Vijay Abraham I <kishon@ti.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-11-28 18:29:01 +01:00
Vidya Sagar	f7f886b51b	PCI: tegra: Enable Relaxed Ordering only for Tegra20 & Tegra30 commit 7be142caabc4780b13a522c485abc806de5c4114 upstream. The PCI Tegra controller conversion to a device tree configurable driver in commit d1523b52bff3 ("PCI: tegra: Move PCIe driver to drivers/pci/host") implied that code for the driver can be compiled in for a kernel supporting multiple platforms. Unfortunately, a blind move of the code did not check that some of the quirks that were applied in arch/arm (eg enabling Relaxed Ordering on all PCI devices - since the quirk hook erroneously matches PCI_ANY_ID for both Vendor-ID and Device-ID) are now applied in all kernels that compile the PCI Tegra controlled driver, DT and ACPI alike. This is completely wrong, in that enablement of Relaxed Ordering is only required by default in Tegra20 platforms as described in the Tegra20 Technical Reference Manual (available at https://developer.nvidia.com/embedded/downloads#?search=tegra%202 in Section 34.1, where it is mentioned that Relaxed Ordering bit needs to be enabled in its root ports to avoid deadlock in hardware) and in the Tegra30 platforms for the same reasons (unfortunately not documented in the TRM). There is no other strict requirement on PCI devices Relaxed Ordering enablement on any other Tegra platforms or PCI host bridge driver. Fix this quite upsetting situation by limiting the vendor and device IDs to which the Relaxed Ordering quirk applies to the root ports in question, reported above. Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [lorenzo.pieralisi@arm.com: completely rewrote the commit log/fixes tag] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Thierry Reding <treding@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-11-12 19:15:54 +01:00
Rafael J. Wysocki	1ee3376d77	PCI: PM: Fix pci_power_up() commit 45144d42f299455911cc29366656c7324a3a7c97 upstream. There is an arbitrary difference between the system resume and runtime resume code paths for PCI devices regarding the delay to apply when switching the devices from D3cold to D0. Namely, pci_restore_standard_config() used in the runtime resume code path calls pci_set_power_state() which in turn invokes __pci_start_power_transition() to power up the device through the platform firmware and that function applies the transition delay (as per PCI Express Base Specification Revision 2.0, Section 6.6.1). However, pci_pm_default_resume_early() used in the system resume code path calls pci_power_up() which doesn't apply the delay at all and that causes issues to occur during resume from suspend-to-idle on some systems where the delay is required. Since there is no reason for that difference to exist, modify pci_power_up() to follow pci_set_power_state() more closely and invoke __pci_start_power_transition() from there to call the platform firmware to power up the device (in case that's necessary). Fixes: db288c9c5f9d ("PCI / PM: restore the original behavior of pci_set_power_state()") Reported-by: Daniel Drake <drake@endlessm.com> Tested-by: Daniel Drake <drake@endlessm.com> Link: https://lore.kernel.org/linux-pm/CAD8Lp44TYxrMgPLkHCqF9hv6smEurMXvmmvmtyFhZ6Q4SE+dig@mail.gmail.com/T/#m21be74af263c6a34f36e0fc5c77c5449d9406925 Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: 3.10+ <stable@vger.kernel.org> # 3.10+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2019-10-29 09:15:26 +01:00
Nishka Dasgupta	c0c2a1ad68	PCI: tegra: Fix OF node reference leak [ Upstream commit 9e38e690ace3e7a22a81fc02652fc101efb340cf ] Each iteration of for_each_child_of_node() executes of_node_put() on the previous node, but in some return paths in the middle of the loop of_node_put() is missing thus causing a reference leak. Hence stash these mid-loop return values in a variable 'err' and add a new label err_node_put which executes of_node_put() on the previous node and returns 'err' on failure. Change mid-loop return statements to point to jump to this label to fix the reference leak. Issue found with Coccinelle. Signed-off-by: Nishka Dasgupta <nishkadg.linux@gmail.com> [lorenzo.pieralisi@arm.com: rewrote commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-10-07 18:53:16 +02:00
Bharat Kumar Gogada	5086e479e2	PCI: xilinx-nwl: Fix Multi MSI data programming [ Upstream commit 181fa434d0514e40ebf6e9721f2b72700287b6e2 ] According to the PCI Local Bus specification Revision 3.0, section 6.8.1.3 (Message Control for MSI), endpoints that are Multiple Message Capable as defined by bits [3:1] in the Message Control for MSI can request a number of vectors that is power of two aligned. As specified in section 6.8.1.6 "Message data for MSI", the Multiple Message Enable field (bits [6:4] of the Message Control register) defines the number of low order message data bits the function is permitted to modify to generate its system software allocated vectors. The MSI controller in the Xilinx NWL PCIe controller supports a number of MSI vectors specified through a bitmap and the hwirq number for an MSI, that is the value written in the MSI data TLP is determined by the bitmap allocation. For instance, in a situation where two endpoints sitting on the PCI bus request the following MSI configuration, with the current PCI Xilinx bitmap allocation code (that does not align MSI vector allocation on a power of two boundary): Endpoint #1: Requesting 1 MSI vector - allocated bitmap bits 0 Endpoint #2: Requesting 2 MSI vectors - allocated bitmap bits [1,2] The bitmap value(s) corresponds to the hwirq number that is programmed into the Message Data for MSI field in the endpoint MSI capability and is detected by the root complex to fire the corresponding MSI irqs. The value written in Message Data for MSI field corresponds to the first bit allocated in the bitmap for Multi MSI vectors. The current Xilinx NWL MSI allocation code allows a bitmap allocation that is not a power of two boundaries, so endpoint #2, is allowed to toggle Message Data bit[0] to differentiate between its two vectors (meaning that the MSI data will be respectively 0x0 and 0x1 for the two vectors allocated to endpoint #2). This clearly aliases with the Endpoint #1 vector allocation, resulting in a broken Multi MSI implementation. Update the code to allocate MSI bitmap ranges with a power of two alignment, fixing the bug. Fixes: ab597d35ef11 ("PCI: xilinx-nwl: Add support for Xilinx NWL PCIe Host Controller") Suggested-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Bharat Kumar Gogada <bharat.kumar.gogada@xilinx.com> [lorenzo.pieralisi@arm.com: updated commit log] Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Marc Zyngier <marc.zyngier@arm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>	2019-08-04 09:33:38 +02:00

1 2 3 4 5 ...

5307 Commits