10868 Commits

Author SHA1 Message Date
Rick Wertenbroek
2dba285cab PCI: rockchip-ep: Remove wrong mask on subsys_vendor_id
Remove wrong mask on subsys_vendor_id. Both the Vendor ID and Subsystem
Vendor ID are u16 variables and are written to a u32 register of the
controller. The Subsystem Vendor ID was always 0 because the u16 value
was masked incorrectly with GENMASK(31,16) resulting in all lower 16
bits being set to 0 prior to the shift.

Remove both masks as they are unnecessary and set the register correctly
i.e., the lower 16-bits are the Vendor ID and the upper 16-bits are the
Subsystem Vendor ID.

This is documented in the RK3399 TRM section 17.6.7.1.17

[kwilczynski: removed unnecesary newline]
Fixes: cf590b078391 ("PCI: rockchip: Add EP driver for Rockchip PCIe controller")
Link: https://lore.kernel.org/linux-pci/20240403144508.489835-1-rick.wertenbroek@gmail.com
Signed-off-by: Rick Wertenbroek <rick.wertenbroek@gmail.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: stable@vger.kernel.org
2024-05-15 16:07:30 -05:00
Ilpo Järvinen
fe4a83ec07 PCI: Make pcie_bandwidth_capable() static
pcie_bandwidth_capable() is only used within pci.c, make it static.

Link: https://lore.kernel.org/r/20240507121758.13849-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-08 19:03:55 -05:00
Kuppuswamy Sathyanarayanan
e2e78a294a PCI/EDR: Align EDR_PORT_LOCATE_DSM with PCI Firmware r3.3
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_LOCATE_DSM function with
Revision ID 5 with a return value encoding (Bits 2:0 = Function, Bits 7:3 =
Device, Bits 15:8 = Bus).  When the ECN was integrated into PCI Firmware
r3.3, sec 4.6.13, Bit 31 was added to indicate success or failure.

Check Bit 31 for failure in acpi_dpc_port_get().

Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com
Link: https://members.pcisig.com/wg/PCI-SIG/document/12888
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[bhelgaas: split into two patches, update commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform
2024-05-08 15:08:39 -05:00
Kuppuswamy Sathyanarayanan
f24ba84613 PCI/EDR: Align EDR_PORT_DPC_ENABLE_DSM with PCI Firmware r3.3
The "Downstream Port Containment related Enhancements" ECN of Jan 28, 2019
(document 12888 below), defined the EDR_PORT_DPC_ENABLE_DSM function with
Revision ID 5 with Arg3 being an integer.  But when the ECN was integrated
into PCI Firmware r3.3, sec 4.6.12, it was defined as Revision ID 6 with
Arg3 being a package containing an integer.

The implementation in acpi_enable_dpc() supplies a package as Arg3 (arg4 in
the code), but it previously specified Revision ID 5.  Align this with PCI
Firmware r3.3 by using Revision ID 6.

If firmware implemented per the ECN, its Revision 5 function would receive
a package as Arg3 when it expects an integer, so acpi_enable_dpc() would
likely fail.  If such firmware exists and lacks a Revision 6 function that
expects a package, we may have to add support for Revision 5.

Link: https://lore.kernel.org/r/20240501022543.1626025-1-sathyanarayanan.kuppuswamy@linux.intel.com
Link: https://members.pcisig.com/wg/PCI-SIG/document/12888
Fixes: ac1c8e35a326 ("PCI/DPC: Add Error Disconnect Recover (EDR) support")
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
[bhelgaas: split into two patches, update commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Satish Thatchanamurthy <Satish.Thatchanamurt@Dell.com> # one platform
2024-05-08 14:32:08 -05:00
Dave Jiang
53c49b6e6d PCI/CXL: Add 'cxl_bus' reset method for devices below CXL Ports
By default Secondary Bus Reset (SBR) is masked for CXL Ports (see CXL r3.1,
sec 8.1.5.2).

Add cxl_reset_bus_function() (method "cxl_bus") to set the "Unmask SBR" bit
in the upstream CXL Port before performing the bus reset and restore the
original value afterwards.

This method allows the user to perform a bus reset on a CXL device without
needing to set the "Unmask SBR" bit via a user tool.

Link: https://lore.kernel.org/r/20240502165851.1948523-5-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: simplify commit log, invert condition to avoid negation]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08 13:25:43 -05:00
Dave Jiang
b1956e2d07 PCI/CXL: Fail bus reset if upstream CXL Port has SBR masked
Per CXL spec r3.1, sec 8.1.5.2, the Secondary Bus Reset (SBR) bit in the
Bridge Control register of a CXL port has no effect unless the "Unmask SBR"
bit is set.

Return -ENOTTY if we attempt a bus reset on a device below a CXL Port where
"Unmask SBR" is 0.  Otherwise, the bus reset would appear to have succeeded
even though setting the bridge SBR bit had no effect.

Link: https://lore.kernel.org/linux-cxl/20240220203956.GA1502351@bhelgaas/
Link: https://lore.kernel.org/r/20240502165851.1948523-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: simplify commit log and comments]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
2024-05-08 13:25:36 -05:00
Dave Jiang
7e89efc6e9 PCI: Lock upstream bridge for pci_reset_function()
Fix a long-standing locking gap for missing pci_cfg_access_lock() while
manipulating bridge reset registers and configuration during
pci_reset_bus_function().

If there is an upstream bridge, lock it before locking the device itself.
pci_dev_lock() calls pci_cfg_access_lock(), which blocks the writing of PCI
config space by user space.

Add lockdep assertion via pci_dev->cfg_access_lock to verify
pci_dev->block_cfg_access is set.

Co-developed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/20240502165851.1948523-3-dave.jiang@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-08 13:19:20 -05:00
Linus Torvalds
1ab1a19db1 pci-v6.9-fixes-2
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmY61iEUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vwtwA//Zw8a27/+cHuciYCOYMIrjhucBUCc
 qHBdzDWTy+h3gkfbcRFfXs3XaIBhlGbtI1d0GG5FyMuqicxCsF/mCIyc2LSTMIUo
 4201qVl/EGrNIBhOVcZtK+CFQmwmw1AaBdz7q4dS4/549xXGQ+/8DibAjfUlcDgC
 2iIkcvfNW9Hj9n4tFezNSPLewGVgFY2yFpImLHZc2hAuSXQ0P0D9JEDUUVVIWg/c
 PSJQKKita/fxgKk8RRCTRdpVezAtd7QO8V4Ae5gGH+oho4nRvCO0kYGteglx7/ab
 ReNtfNUPJN9h7M5ZYpyiNp1aZTaMEp3P+gMsD9ohV0/+5MNNAiZhDLPguQaEy/2n
 ZiQh5K3vwQb2NStJXauiBqJ+NHeqf8m3mk76X3/hxma6wqDfEOsRvYaexwY+Wxfa
 I0tzjZF1LBepsoFyDJM/5S+3nCJoqaCUAy1ZbGXwsBAAzZHw6x9+ieJfJhnCOL96
 kkNiNlxs8OJTTMl6F8W88NvMnhmCF0JxSOVTfxTaVwCaD6GwpnMNpXSEXqXPlVL1
 jMRHr/hZ7JjHarELC/TGe1uUmPsBhIym862XV+7E+9uUbcWldwjBijuSZQ9zUpX2
 UL0Cc2gJzyh/GwpeDVCyBGaINxzNVq3D5H6rYHlWQP+dp59stt/UBqBWJmerTHjy
 QQ+9+XS/CiElUL8=
 =EQN9
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull pci fixes from Bjorn Helgaas:

 - Update kernel-parameters doc to describe "pcie_aspm=off" more
   accurately (Bjorn Helgaas)

 - Restore the parent's (not the child's) ASPM state to the parent
   during resume, which fixes a reboot during resume (Kai-Heng Feng)

* tag 'pci-v6.9-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/ASPM: Restore parent state to parent, child state to child
  PCI/ASPM: Clarify that pcie_aspm=off means leave ASPM untouched
2024-05-08 09:37:58 -07:00
Kai-Heng Feng
f3d049b35b PCI/ASPM: Restore parent state to parent, child state to child
There's a typo that makes parent device uses child LNKCTL value and vice
versa. This causes Micron NVMe to trigger a reboot upon system resume.

Correct the typo to fix the issue.

Fixes: 64dbb2d70744 ("PCI/ASPM: Disable L1 before configuring L1 Substates")
Link: https://lore.kernel.org/r/20240506051602.1990743-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
[bhelgaas: update subject]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
2024-05-06 14:12:40 -05:00
Nam Cao
b023c1c97f PCI: hotplug: Remove obsolete sgi_hotplug TODO notes
Commit c7532b601e77 ("PCI/hotplug: remove the sgi_hotplug driver") deleted
the driver.

Remove the remaining TODO notes as well.

Link: https://lore.kernel.org/r/26784ee39fbb3fbd0fe96508158d74419018e6ad.1714762038.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-03 16:26:50 -05:00
Nam Cao
e6e9a9a455 PCI: hotplug: Document unchecked return value of pci_hp_add_bridge()
Some hotplug drivers do not check the return value of pci_hp_add_bridge().
This may be problematic if the driver proceeds after pci_hp_add_bridge()
fails.

Link: https://lore.kernel.org/r/16a2442ea6ee896987a44df3ed509e4cfde44475.1714762038.git.namcao@linutronix.de
Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-03 16:25:52 -05:00
Duoming Zhou
e6f7d27df5 PCI: of_property: Return error for int_map allocation failure
Return -ENOMEM from of_pci_prop_intr_map() if kcalloc() fails to prevent a
NULL pointer dereference in this case.

Fixes: 407d1a51921e ("PCI: Create device tree node for bridge")
Link: https://lore.kernel.org/r/20240303105729.78624-1-duoming@zju.edu.cn
Signed-off-by: Duoming Zhou <duoming@zju.edu.cn>
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-02 17:15:01 -05:00
Ilpo Järvinen
dc69062a1a PCI/ASPM: Clean up ASPM disable/enable mask calculation
With only one set of defines remaining, state can be almost used as-is to
set ->aspm_disable/default. Only CLKPM and L1 PM substates need special
handling.

Remove unnecessary if conditions that can use the state variable bits
directly. Move ASPM mask calculation into pci_calc_aspm_enable_mask() and
pci_calc_aspm_disable_mask() helpers which makes it easier to alter state
variable directly.

Link: https://lore.kernel.org/r/20240322123952.6384-3-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-02 17:02:19 -05:00
Ilpo Järvinen
b478e162f2 PCI/ASPM: Consolidate link state defines
The linux/pci.h and aspm.c files define their own sets of link state
related defines which are almost the same.

Consolidate the use of defines into those defined by linux/pci.h and expand
PCIE_LINK_STATE_L0S to match earlier ASPM_STATE_L0S that includes both
upstream and downstream bits. Rename also the defines that are internal to
aspm.c to start with PCIE_LINK_STATE for consistency.

While the PCIE_LINK_STATE_L0S BIT(0) -> (BIT(0) | BIT(1)) transformation is
not 1:1, in practice aspm.c already used ASPM_STATE_L0S that has both bits
enabled except during mapping.

While at it, place the PCIE_LINK_STATE_CLKPM define last to have more
logical grouping.

Use static_assert() to ensure PCIE_LINK_STATE_L0S is strictly equal to the
combination of PCIE_LINK_STATE_L0S_UP/DW.

Link: https://lore.kernel.org/r/20240322123952.6384-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-05-02 16:58:11 -05:00
Ilpo Järvinen
4c407392c1 PCI: Clean up accessor macro formatting
Clean up formatting of PCI accessor macros:

  - Put return statements on own line

  - Add a few empty lines for better readability

  - Align macro continuation backslashes

  - Correct function call argument indentation level

  - Reorder variable declarations to order of use

  - Drop unnecessary variable initialization

Link: https://lore.kernel.org/r/20240429094707.2529-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
[bhelgaas: drop initialization, tweak variables to order of use]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-29 10:43:58 -05:00
Ilpo Järvinen
c9758cc45c PCI/ERR: Cleanup misleading indentation inside if conditions
A few if conditions align misleadingly with the following code block.  The
checks are really cascading NULL checks that fit into 80 chars so remove
newlines in between and realign to the if condition indent.

Link: https://lore.kernel.org/r/20240429094707.2529-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-29 10:33:29 -05:00
Andy Shevchenko
0ba5cd94bb PCI/MSI: Make error path handling follow the standard pattern
Make error path handling follow the standard pattern, i.e.  checking for
errors first. This makes code much easier to read and understand despite
being a bit longer.

Link: https://lore.kernel.org/r/20240426144039.557907-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-26 17:04:20 -05:00
Damien Le Moal
a665b0a93f PCI/portdrv: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
In the PCI Express Port Bus Driver, use the macro PCI_IRQ_INTX instead of
the now deprecated PCI_IRQ_LEGACY macro.

Link: https://lore.kernel.org/r/20240325070944.3600338-3-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25 12:53:29 -05:00
Damien Le Moal
bb14a6a870 PCI/MSI: Use PCI_IRQ_INTX instead of PCI_IRQ_LEGACY
In pci_alloc_irq_vectors_affinity(), use the macro PCI_IRQ_INTX instead of
the now deprecated PCI_IRQ_LEGACY macro.

Link: https://lore.kernel.org/r/20240325070944.3600338-2-dlemoal@kernel.org
Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25 12:53:29 -05:00
Ilpo Järvinen
cdc6c4abcb PCI: Clarify intent of LT wait
Clarify the comment relating to the LT wait and the purpose of the check
that implements the implementation note in PCIe r6.1 sec 7.5.3.7.

Suggested-by: Maciej W. Rozycki <macro@orcam.me.uk>
Link: https://lore.kernel.org/r/20240423130820.43824-2-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25 12:53:29 -05:00
Ilpo Järvinen
73cb3a35f9 PCI: Wait for Link Training==0 before starting Link retrain
Two changes were made in link retraining logic independent of each other.

The commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") added a
check to pcie_retrain_link() to ensure no Link Training is currently active
to address the Implementation Note in PCIe r6.1 sec 7.5.3.7. At that time
pcie_wait_for_retrain() only checked for the Link Training (LT) bit being
cleared.

The commit 680e9c47a229 ("PCI: Add support for polling DLLLA to
pcie_retrain_link()") generalized pcie_wait_for_retrain() into
pcie_wait_for_link_status() which can wait either for LT or the Data Link
Layer Link Active (DLLLA) bit with 'use_lt' argument and supporting waiting
for either cleared or set using 'active' argument.

In the merge commit 1abb47390350 ("Merge branch 'pci/enumeration'"), those
two divergent branches converged. The merge changed LT bit checking added
in the commit e7e39756363a ("PCI/ASPM: Avoid link retraining race") to now
wait for completion of any ongoing Link Training using DLLLA bit being set
if 'use_lt' is false.

When 'use_lt' is false, the pseudo-code steps of what occurs in
pcie_retrain_link():

	1. Wait for DLLLA==1
	2. Trigger link to retrain
	3. Wait for DLLLA==1

Step 3 waits for the link to come up from the retraining triggered by Step
2. As Step 1 is supposed to wait for any ongoing retraining to end, using
DLLLA also for it does not make sense because link training being active is
still indicated using LT bit, not with DLLLA.

Correct the pcie_wait_for_link_status() parameters in Step 1 to only wait
for LT==0 to ensure there is no ongoing Link Training.

This only impacts the Target Speed quirk, which is the only case where
waiting for DLLLA bit is used. It currently works in the problematic case
by means of link training getting initiated by hardware repeatedly and
respecting the new link parameters set by the caller, which then make
training succeed and bring the link up, setting DLLLA and causing
pcie_wait_for_link_status() to return success. We are not supposed to rely
on luck and need to make sure that LT transitioned through the inactive
state though before we initiate link training by hand via RL (Retrain Link)
bit.

Fixes: 1abb47390350 ("Merge branch 'pci/enumeration'")
Link: https://lore.kernel.org/r/20240423130820.43824-1-ilpo.jarvinen@linux.intel.com
Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-25 12:53:28 -05:00
Vidya Sagar
7bf9d2af7e PCI: Clear Secondary Status errors after enumeration
We enumerate devices by attempting config reads to the Vendor ID of each
possible device.  On conventional PCI, if no device responds, the read
terminates with a Master Abort (PCI r3.0, sec 6.1).  On PCIe, the config
read is terminated as an Unsupported Request (PCIe r6.0, sec 2.3.2,
7.5.1.3.7).  In either case, if the read addressed a device below a bridge,
it is logged by setting "Received Master Abort" in the bridge Secondary
Status register.

Clear any errors logged in the Secondary Status register after enumeration.

Link: https://lore.kernel.org/r/20240116143258.483235-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
[bhelgaas: simplify commit log]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-23 16:08:17 -05:00
Heiner Kallweit
c7ae396ec5 PCI: Annotate pci_cache_line_size variables as __ro_after_init
Annotate both variables as __ro_after_init, enforcing that they can't be
changed after the init phase.

Link: https://lore.kernel.org/r/52fd058d-6d72-48db-8e61-5fcddcd0aa51@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-18 13:46:19 -05:00
Heiner Kallweit
e30556bf68 PCI: Constify pcibus_class
Constify pcibus_class. All users take a const struct class * argument.

Link: https://lore.kernel.org/r/5e01f46f-266f-4fb3-be8a-8cb9e566cd75@gmail.com
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2024-04-17 10:44:17 -05:00
Kuppuswamy Sathyanarayanan
a29e5290e3 PCI/AER: Update aer-inject tool source URL
The aer-inject tool is no longer maintained in the original repository
and is missing a fix related to the musl library. So, with the author's
(Huang Ying) consent, it has been moved to a new repository [1].

Update all references to the repository link.

Link: https://github.com/intel/aer-inject.git [1]
Link: https://lore.kernel.org/r/20240416055035.200085-1-sathyanarayanan.kuppuswamy@linux.intel.com
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Huang Ying <ying.huang@intel.com>
2024-04-17 10:35:43 -05:00
Sergio Paracuellos
fd6eb49a84
PCI: mt7621: Fix string truncation in mt7621_pcie_parse_port()
The following warning appears when driver is compiled with W=1.

CC      drivers/pci/controller/pcie-mt7621.o
drivers/pci/controller/pcie-mt7621.c: In function ‘mt7621_pcie_probe’:
drivers/pci/controller/pcie-mt7621.c:228:49: error: ‘snprintf’ output may
be truncated before the last format character [-Werror=format-truncation=]
228 |         snprintf(name, sizeof(name), "pcie-phy%d", slot);
    |                                                 ^
drivers/pci/controller/pcie-mt7621.c:228:9: note: ‘snprintf’ output between
10 and 11 bytes into a destination of size 10
228 |         snprintf(name, sizeof(name), "pcie-phy%d", slot);
    |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Clean this up increasing destination buffer one byte.

[kwilczynski: commit log]
Closes: https://lore.kernel.org/linux-pci/20240110212302.GA2123146@bhelgaas/T/#t
Link: https://lore.kernel.org/linux-pci/20240111082704.2259450-1-sergio.paracuellos@gmail.com
Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
2024-04-11 11:33:48 +00:00
Vidya Sagar
19326006a2
PCI: tegra194: Fix probe path for Endpoint mode
Tegra194 PCIe probe path is taking failure path in success case for
Endpoint mode. Return success from the switch case instead of going
into the failure path.

Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194")
Link: https://lore.kernel.org/linux-pci/20240408093053.3948634-1-vidyas@nvidia.com
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Jon Hunter <jonathanh@nvidia.com>
2024-04-10 18:10:20 +00:00
Niklas Cassel
597ac0fa37
PCI: endpoint: pci-epf-test: Clean up pci_epf_test_unbind()
Clean up pci_epf_test_unbind() by using a continue if we did not allocate
memory for the BAR index. This reduces the indentation level by one.

This makes pci_epf_test_unbind() more similar to pci_epf_test_set_bar().

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-6-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 18:02:04 +00:00
Niklas Cassel
e49eab944c
PCI: endpoint: pci-epf-test: Simplify pci_epf_test_set_bar() loop
Simplify the loop in pci_epf_test_set_bar().
If we allocated memory for the BAR, we need to call set_bar() for that
BAR, if we did not allocated memory for that BAR, we need to skip.
It is as simple as that. This also matches the logic in
pci_epf_test_unbind().

A 64-bit BAR will still only be one allocation, with the BAR succeeding
the 64-bit BAR being null.

While at it, remove the misleading comment.
A EPC .set_bar() callback should never change the epf_bar->flags.
(E.g. to set a 64-bit BAR if we requested a 32-bit BAR.)

A .set_bar() callback should do what we request it to do.
If it can't satisfy the request, it should return an error.

If platform has a specific requirement, e.g. that a certain BAR has to
be a 64-bit BAR, then it should specify that by setting the .only_64bit
flag for that specific BAR in epc_features->bar[], such that
pci_epf_alloc_space() will return a epf_bar with the 64-bit flag set.
(Such that .set_bar() will receive a request to set a 64-bit BAR.)

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-5-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 18:01:46 +00:00
Niklas Cassel
828e870431
PCI: endpoint: pci-epf-test: Remove superfluous code
The only reason why we call pci_epf_configure_bar() is to set
PCI_BASE_ADDRESS_MEM_TYPE_64 in case the hardware requires it.

However, this flag is now automatically set when allocating a BAR that
can only be a 64-bit BAR, so we can drop pci_epf_configure_bar()
completely.

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-4-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 18:01:25 +00:00
Niklas Cassel
29a025b6fb
PCI: endpoint: Allocate a 64-bit BAR if that is the only option
pci_epf_alloc_space() already sets the 64-bit flag if the BAR size is
larger than 2GB, even if the caller did not explicitly request a 64-bit
BAR.

Thus, let pci_epf_alloc_space() also set the 64-bit flag if the hardware
description says that the specific BAR can only be 64-bit.

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-3-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 18:00:52 +00:00
Niklas Cassel
417660525d
PCI: endpoint: pci-epf-test: Simplify pci_epf_test_alloc_space() loop
Make pci-epf-test use pci_epc_get_next_free_bar() just like pci-epf-ntb.c
and pci-epf-vntb.c.

Using pci_epc_get_next_free_bar() also makes it more obvious that
pci-epf-test does no special configuration at all.

(The only configuration pci-epf-test does is setting
PCI_BASE_ADDRESS_MEM_TYPE_64 if epc_features has marked the specific BAR
as only_64bit. pci_epc_get_next_free_bar() already takes only_64bit into
account when looping.)

This way, the code is more consistent between EPF drivers, and pci-epf-test
does not need to explicitly check if the BAR is reserved, or if the index
belongs to a BAR succeeding a 64-bit only BAR.

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-2-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 18:00:04 +00:00
Manivannan Sadhasivam
a01e7214be
PCI: endpoint: Remove "core_init_notifier" flag
"core_init_notifier" flag is set by the glue drivers requiring refclk from
the host to complete the DWC core initialization. Also, those drivers will
send a notification to the EPF drivers once the initialization is fully
completed using the pci_epc_init_notify() API. Only then, the EPF drivers
will start functioning.

For the rest of the drivers generating refclk locally, EPF drivers will
start functioning post binding with them. EPF drivers rely on the
'core_init_notifier' flag to differentiate between the drivers.
Unfortunately, this creates two different flows for the EPF drivers.

So to avoid that, let's get rid of the "core_init_notifier" flag and follow
a single initialization flow for the EPF drivers. This is done by calling
the dw_pcie_ep_init_notify() from all glue drivers after the completion of
dw_pcie_ep_init_registers() API. This will allow all the glue drivers to
send the notification to the EPF drivers once the initialization is fully
completed.

Only difference here is that, the drivers requiring refclk from host will
send the notification once refclk is received, while others will send it
during probe time itself.

But this also requires the EPC core driver to deliver the notification
after EPF driver bind. Because, the glue driver can send the notification
before the EPF drivers bind() and in those cases the EPF drivers will miss
the event. To accommodate this, EPC core is now caching the state of the
EPC initialization in 'init_complete' flag and pci-ep-cfs driver sends the
notification to EPF drivers based on that after each EPF driver bind.

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-8-082625472414@linaro.org
Tested-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-10 17:52:42 +00:00
Manivannan Sadhasivam
df69e17ccc
PCI: dwc: ep: Call dw_pcie_ep_init_registers() API directly from all glue drivers
Currently, dw_pcie_ep_init_registers() API is directly called by the glue
drivers requiring active refclk from host. But for the other drivers, it is
getting called implicitly by dw_pcie_ep_init(). This is due to the fact
that this API initializes DWC EP specific registers and that requires an
active refclk (either from host or generated locally by endpoint itsef).

But, this causes a discrepancy among the glue drivers. So to avoid this
confusion, let's call this API directly from all glue drivers irrespective
of refclk dependency. Only difference here is that the drivers requiring
refclk from host will call this API only after the refclk is received and
other drivers without refclk dependency will call this API right after
dw_pcie_ep_init().

With this change, the check for 'core_init_notifier' flag can now be
dropped from dw_pcie_ep_init() API. This will also allow us to remove the
'core_init_notifier' flag completely in the later commits.

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-7-082625472414@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
2024-04-10 17:52:11 +00:00
Manivannan Sadhasivam
7d6e64c443
PCI: dwc: ep: Rename dw_pcie_ep_init_complete() to dw_pcie_ep_init_registers()
The goal of the dw_pcie_ep_init_complete() API is to initialize the DWC
specific registers post registering the controller with the EP framework.

But the naming doesn't reflect its functionality and causes confusion. So,
let's rename it to dw_pcie_ep_init_registers() to make it clear that it
initializes the DWC specific registers.

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-6-082625472414@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-10 17:51:28 +00:00
Manivannan Sadhasivam
570d7715ee
PCI: dwc: ep: Introduce dw_pcie_ep_cleanup() API for drivers supporting PERST#
For DWC glue drivers supporting PERST# (currently Qcom and Tegra194), some
of the DWC resources like eDMA should be cleaned up during the PERST#
assert time.

So let's introduce a dw_pcie_ep_cleanup() API that could be called by these
drivers to cleanup the DWC specific resources. Currently, it just removes
eDMA.

Closes: https://lore.kernel.org/linux-pci/ZWYmX8Y%2F7Q9WMxES@x1-carbon
Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-5-082625472414@linaro.org
Reported-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-10 17:50:06 +00:00
Manivannan Sadhasivam
c8682a3314
PCI: dwc: ep: Rename dw_pcie_ep_exit() to dw_pcie_ep_deinit()
dw_pcie_ep_exit() API is undoing what the dw_pcie_ep_init() API has done
already (at least partly). But the API name dw_pcie_ep_exit() is not quite
reflecting that. So let's rename it to dw_pcie_ep_deinit() to make the
purpose of this API clear. This also aligns with the DWC host driver.

Link: https://patchwork.kernel.org/project/linux-pci/patch/20240327-pci-dbi-rework-v12-4-082625472414@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
2024-04-10 17:49:21 +00:00
Manivannan Sadhasivam
b7dec6b850
PCI: dwc: ep: Remove deinit() callback from struct dw_pcie_ep_ops
deinit() callback was solely introduced for the pcie-rcar-gen4 driver where
it is used to do platform specific resource deallocation. And this callback
is called right at the end of the dw_pcie_ep_exit() API. So it doesn't
matter whether it is called within or outside of dw_pcie_ep_exit() API.

So let's remove this callback and directly call rcar_gen4_pcie_ep_deinit()
in pcie-rcar-gen4 driver to do resource deallocation after the completion
of dw_pcie_ep_exit() API in rcar_gen4_remove_dw_pcie_ep().

This simplifies the DWC layer.

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-3-082625472414@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
2024-04-10 17:48:41 +00:00
Manivannan Sadhasivam
7cbebc86c7
PCI: dwc: ep: Add Kernel-doc comments for APIs
All of the APIs are missing the Kernel-doc comments. Hence, add them.

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-2-082625472414@linaro.org
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-10 17:47:51 +00:00
Manivannan Sadhasivam
869bc52534
PCI: dwc: ep: Fix DBI access failure for drivers requiring refclk from host
The DWC glue drivers requiring an active reference clock from the PCIe host
for initializing their PCIe EP core, set a flag called 'core_init_notifier'
to let DWC driver know that these drivers need a special attention during
initialization. In these drivers, access to the hw registers (like DBI)
before receiving the active refclk from host will result in access failure
and also could cause a whole system hang.

But the current DWC EP driver doesn't honor the requirements of the drivers
setting 'core_init_notifier' flag and tries to access the DBI registers
during dw_pcie_ep_init(). This causes the system hang for glue drivers such
as Tegra194 and Qcom EP as they depend on refclk from host and have set the
above mentioned flag.

To workaround this issue, users of the affected platforms have to maintain
the dependency with the PCIe host by booting the PCIe EP after host boot.
But this won't provide a good user experience, since PCIe EP is _one_ of
the features of those platforms and it doesn't make sense to delay the
whole platform booting due to PCIe requiring active refclk.

So to fix this issue, let's move all the DBI access from
dw_pcie_ep_init() in the DWC EP driver to the dw_pcie_ep_init_complete()
API. This API will only be called by the drivers setting
'core_init_notifier' flag once refclk is received from host. For the rest
of the drivers that gets the refclk locally, this API will be called
within dw_pcie_ep_init().

Link: https://lore.kernel.org/linux-pci/20240327-pci-dbi-rework-v12-1-082625472414@linaro.org
Fixes: e966f7390da9 ("PCI: dwc: Refactor core initialization code for EP mode")
Co-developed-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Vidya Sagar <vidyas@nvidia.com>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Niklas Cassel <cassel@kernel.org>
2024-04-10 17:46:28 +00:00
Niklas Cassel
de66b37a17
PCI: rockchip-ep: Set a 64-bit BAR if requested
Ever since commit f25b5fae29d4 ("PCI: endpoint: Setting a BAR size > 4 GB
is invalid if 64-bit flag is not set") it has been impossible to get the
.set_bar() callback with a BAR size > 2 GB (yes, 2GB!), if the BAR was
also not requested to be configured as a 64-bit BAR.

It is however possible that an EPF driver configures a BAR as 64-bit,
even if the requested size is < 4 GB.

Respect the requested BAR configuration, just like how it is already
respected with regards to the prefetchable bit.

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-8-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 17:32:31 +00:00
Niklas Cassel
07db0fa80c
PCI: cadence: Set a 64-bit BAR if requested
Ever since commit f25b5fae29d4 ("PCI: endpoint: Setting a BAR size > 4 GB
is invalid if 64-bit flag is not set") it has been impossible to get the
.set_bar() callback with a BAR size > 2 GB (yes, 2GB!), if the BAR was
also not requested to be configured as a 64-bit BAR.

Thus, forcing setting the 64-bit flag for BARs larger than 2 GB in the
lower level driver is dead code and can be removed.

It is however possible that an EPF driver configures a BAR as 64-bit,
even if the requested size is < 4 GB.

Respect the requested BAR configuration, just like how it is already
respected with regards to the prefetchable bit.

Link: https://lore.kernel.org/linux-pci/20240320113157.322695-7-cassel@kernel.org
Signed-off-by: Niklas Cassel <cassel@kernel.org>
Signed-off-by: Krzysztof Wilczyński <kwilczynski@kernel.org>
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
2024-04-10 17:31:16 +00:00
Mario Limonciello
256df20c59 PCI/PM: Avoid D3cold for HP Pavilion 17 PC/1972 PCIe Ports
Hewlett-Packard HP Pavilion 17 Notebook PC/1972 is an Intel Ivy Bridge
system with a muxless AMD Radeon dGPU.  Attempting to use the dGPU fails
with the following sequence:

  ACPI Error: Aborting method \AMD3._ON due to previous error (AE_AML_LOOP_TIMEOUT) (20230628/psparse-529)
  radeon 0000:01:00.0: not ready 1023ms after resume; waiting
  radeon 0000:01:00.0: not ready 2047ms after resume; waiting
  radeon 0000:01:00.0: not ready 4095ms after resume; waiting
  radeon 0000:01:00.0: not ready 8191ms after resume; waiting
  radeon 0000:01:00.0: not ready 16383ms after resume; waiting
  radeon 0000:01:00.0: not ready 32767ms after resume; waiting
  radeon 0000:01:00.0: not ready 65535ms after resume; giving up
  radeon 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible

The issue is that the Root Port the dGPU is connected to can't handle the
transition from D3cold to D0 so the dGPU can't properly exit runtime PM.

The existing logic in pci_bridge_d3_possible() checks for systems that are
newer than 2015 to decide that D3 is safe.  This would nominally work for
an Ivy Bridge system (which was discontinued in 2015), but this system
appears to have continued to receive BIOS updates until 2017 and so this
existing logic doesn't appropriately capture it.

Add the system to bridge_d3_blacklist to prevent D3cold from being used.

Link: https://lore.kernel.org/r/20240307163709.323-1-mario.limonciello@amd.com
Reported-by: Eric Heintzmann <heintzmann.eric@free.fr>
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3229
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Tested-by: Eric Heintzmann <heintzmann.eric@free.fr>
2024-04-10 11:38:23 -05:00
Alexey Kardashevskiy
eebab7e3eb PCI/DOE: Support discovery version 2
PCIe r6.1, sec 6.30.1.1 defines a "DOE Discovery Version" field in
the DOE Discovery Request Data Object Contents (3rd DW) as:

15:8 DOE Discovery Version – must be 02h if the Capability Version in
the Data Object Exchange Extended Capability is 02h or greater.

Add support for the version on devices with the DOE v2 capability.

Link: https://lore.kernel.org/r/20240307022006.3657433-1-aik@amd.com
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
2024-04-09 09:33:15 -05:00
Bjorn Helgaas
302b84e84d Revert "PCI: Mark LSI FW643 to avoid bus reset"
This reverts commit 29a43dc130ce65d365a8ea9e1cc4bc51005a353e.

29a43dc130ce ("PCI: Mark LSI FW643 to avoid bus reset") by Edmund was based
on the assumption that the LSI / Agere FW643 has a defect such that it
can't recover after a Secondary Bus Reset (SBR).

But Takashi Sakamoto reported that SBR works fine on this same FW643 device
in an AMD Ryzen 5 2400G system, so apparently there is some other aspect of
Edmund's system that accounts for the issue.

The down side of 29a43dc130ce is that when the FW643 is assigned to a VM,
avoiding the SBR means we leak data out of the VM.

Revert 29a43dc130ce until we figure out a better solution.  In the
meantime, we can use the sysfs "reset_method" interface to restrict the
available reset methods.

Link: https://lore.kernel.org/r/20240328212302.1582483-1-helgaas@kernel.org
Fixes: 29a43dc130ce ("PCI: Mark LSI FW643 to avoid bus reset")
Reported-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
Link: https://lore.kernel.org/r/20240325012135.36861-1-o-takashi@sakamocchi.jp
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
2024-03-29 11:57:12 -05:00
Kai-Heng Feng
eeee3b5e6d PCI: Mask Replay Timer Timeout errors for Genesys GL975x SD host controller
Due to a hardware defect in GL975x, config accesses when ASPM is enabled
frequently cause Replay Timer Timeouts in the Port leading to the device.

These are Correctable Errors, so the Downstream Port logs it in its AER
Correctable Error Status register and, when the error is not masked, sends
an ERR_COR message upstream.  The message terminates at a Root Port, which
may generate an AER interrupt so the OS can log it.

The Correctable Error logging is an annoyance but not a major issue itself.
But when the AER interrupt happens during suspend, it can prevent the
system from suspending.

015c9cbcf0ad ("mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of
AER") masked these errors in the GL975x itself.

Mask these errors in the Port leading to GL975x as well.  Note that Replay
Timer Timeouts will still be logged in the AER Correctable Error Status
register, but they will not cause AER interrupts.

Link: https://lore.kernel.org/r/20240327024509.1071189-1-kai.heng.feng@canonical.com
Signed-off-by: Kai-Heng Feng <kai.heng.feng@canonical.com>
[bhelgaas: commit log, update dmesg note]
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Victor Shih <victor.shih@genesyslogic.com.tw>
Cc: Ben Chuang <benchuanggli@gmail.com>
2024-03-27 10:16:46 -05:00
Linus Torvalds
705c1da8fa pci-v6.9-changes
-----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEgMe7l+5h9hnxdsnuWYigwDrT+vwFAmXw04sUHGJoZWxnYWFz
 QGdvb2dsZS5jb20ACgkQWYigwDrT+vyT3xAAsp5+c2IcbrXpZZM7figwx4y9PPRp
 jcQ4AYSGP41xqTUGXUTcVZYvRorSIAFEOz33U0SL1UNxoOZz8j/M6SD58k8a6XRr
 9SSPuKja1OKJjONhS1bzrcbVtuzr71ISrECXfLkvW5hY5hvq+3+anMtu3/UIEHu6
 M1vVc+basRjjPJNTixMWvVqS3R+4gDAFeBtdZl/D+U6v0v2xOK+81YZqjfZZCw9v
 xmdHHK2dKNEdysNoRJ5cafY3b1NnSsrxlHbIhBnKt+7uRSWKD1dHcBQj7wDc/HrX
 yBGca+BZBKitXEJM3p5KcWWs4ijaywGw0GSffUIKrN9i6RIfwnxBH9jUbwDngifU
 2IR/kLEqdjYi/WnENxIHpQATLyXhXZ8uEnLS0xMlRIA96u3M5B0mrYOZxaN3bo12
 A3aE+aPOTw0u1wf7G8dBX6IdYnjZ/ZuR9Q+fVoKpZBvsYUVaKyiKCtKMCNaVirn5
 z7nxR1W71ee+35+37KthPXhiw+YtURGz1wBWt+wWUMjBcpIj2bpzU9wQDE9ZMdYt
 XJoJcatrRhJzefO3uzd0egft+vwk0xrj5LQEDhMQyDrnBLC4EgI5niKPWqbay5Nx
 Cnll01CI82xAnIF6eu7OOuI1nYGtoFcY8rP3hTC85cWN7Xi8SAOLTZZcVTpfBMUr
 l2uEll8p+8dZ6IY=
 =AP3I
 -----END PGP SIGNATURE-----

Merge tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci

Pull PCI updates from Bjorn Helgaas:
 "Enumeration:

   - Consolidate interrupt related code in irq.c (Ilpo Järvinen)

   - Reduce kernel size by replacing sysfs resource macros with
     functions (Ilpo Järvinen)

   - Reduce kernel size by compiling sysfs support only when
     CONFIG_SYSFS=y (Lukas Wunner)

   - Avoid using Extended Tags on 3ware-9650SE Root Port to work around
     an apparent hardware defect (Jörg Wedekind)

  Resource management:

   - Fix an MMIO mapping leak in pci_iounmap() (Philipp Stanner)

   - Move pci_iomap.c and other PCI-specific devres code to drivers/pci
     (Philipp Stanner)

   - Consolidate PCI devres code in devres.c (Philipp Stanner)

  Power management:

   - Avoid D3cold on Asus B1400 PCI-NVMe bridge, where firmware doesn't
     know how to return correctly to D0, and remove previous quirk that
     wasn't as specific (Daniel Drake)

   - Allow runtime PM when the driver enables it but doesn't need any
     runtime PM callbacks (Raag Jadav)

   - Drain runtime-idle callbacks before driver removal to avoid races
     between .remove() and .runtime_idle(), which caused intermittent
     page faults when the rtsx .runtime_idle() accessed registers that
     its .remove() had already unmapped (Rafael J. Wysocki)

  Virtualization:

   - Avoid Secondary Bus Reset on LSI FW643 so it can be assigned to VMs
     with VFIO, e.g., for professional audio software on many Apple
     machines, at the cost of leaking state between VMs (Edmund Raile)

  Error handling:

   - Print all logged TLP Prefixes, not just the first, after AER or DPC
     errors (Ilpo Järvinen)

   - Quirk the DPC PIO log size for Intel Raptor Lake Root Ports, which
     still don't advertise a legal size (Paul Menzel)

   - Ignore expected DPC Surprise Down errors on hot removal (Smita
     Koralahalli)

   - Block runtime suspend while handling AER errors to avoid races that
     prevent the device form being resumed from D3hot (Stanislaw
     Gruszka)

  Peer-to-peer DMA:

   - Use atomic XA allocation in RCU read section (Christophe JAILLET)

  ASPM:

   - Collect bits of ASPM-related code that we need even without
     CONFIG_PCIEASPM into aspm.c (David E. Box)

   - Save/restore L1 PM Substates config for suspend/resume (David E.
     Box)

   - Update save_save when ASPM config is changed, so a .slot_reset()
     during error recovery restores the changed config, not the
     .probe()-time config (Vidya Sagar)

  Endpoint framework:

   - Refactor and improve pci_epf_alloc_space() API (Niklas Cassel)

   - Clean up endpoint BAR descriptions (Niklas Cassel)

   - Fix ntb_register_device() name leak in error path (Yang Yingliang)

   - Return actual error code for pci_vntb_probe() failure (Yang
     Yingliang)

  Broadcom STB PCIe controller driver:

   - Fix MDIO write polling, which previously never waited for
     completion (Jonathan Bell)

  Cadence PCIe endpoint driver:

   - Clear the ARI "Next Function Number" of last function (Jasko-EXT
     Wojciech)

  Freescale i.MX6 PCIe controller driver:

   - Simplify by replacing switch statements with function pointers for
     different hardware variants (Frank Li)

   - Simplify by using clk_bulk*() API (Frank Li)

   - Remove redundant DT clock and reg/reg-name details (Frank Li)

   - Add i.MX95 DT and driver support for both Root Complex and Endpoint
     mode (Frank Li)

  Microsoft Hyper-V host bridge driver:

   - Reduce memory usage by limiting ring buffer size to 16KB instead of
     4 pages (Michael Kelley)

  Qualcomm PCIe controller driver:

   - Add X1E80100 DT and driver support (Abel Vesa)

   - Add DT 'required-opps' for SoCs that require a minimum performance
     level (Johan Hovold)

   - Make DT 'msi-map-mask' optional, depending on how MSI interrupts
     are mapped (Johan Hovold)

   - Disable ASPM L0s for sc8280xp, sa8540p and sa8295p because the PHY
     configuration isn't tuned correctly for L0s (Johan Hovold)

   - Split dt-binding qcom,pcie.yaml into qcom,pcie-common.yaml and
     separate files for SA8775p, SC7280, SC8180X, SC8280XP, SM8150,
     SM8250, SM8350, SM8450, SM8550 for easier reviewing (Krzysztof
     Kozlowski)

   - Enable BDF to SID translation by disabling bypass mode (Manivannan
     Sadhasivam)

   - Add endpoint MHI support for Snapdragon SA8775P SoC (Mrinmay
     Sarkar)

  Synopsys DesignWare PCIe controller driver:

   - Allocate 64-bit MSI address if no 32-bit address is available (Ajay
     Agarwal)

   - Fix endpoint Resizable BAR to actually advertise the required 1MB
     size (Niklas Cassel)

  MicroSemi Switchtec management driver:

   - Release resources if the .probe() fails (Christophe JAILLET)

  Miscellaneous:

   - Make pcie_port_bus_type const (Ricardo B. Marliere)"

* tag 'pci-v6.9-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci: (77 commits)
  PCI/ASPM: Update save_state when configuration changes
  PCI/ASPM: Disable L1 before configuring L1 Substates
  PCI/ASPM: Call pci_save_ltr_state() from pci_save_pcie_state()
  PCI/ASPM: Save L1 PM Substates Capability for suspend/resume
  PCI: hv: Fix ring buffer size calculation
  PCI: dwc: endpoint: Fix advertised resizable BAR size
  PCI: cadence: Clear the ARI Capability Next Function Number of the last function
  PCI: dwc: Strengthen the MSI address allocation logic
  PCI: brcmstb: Fix broken brcm_pcie_mdio_write() polling
  PCI: qcom: Add X1E80100 PCIe support
  dt-bindings: PCI: qcom: Document the X1E80100 PCIe Controller
  PCI: qcom: Enable BDF to SID translation properly
  PCI/AER: Generalize TLP Header Log reading
  PCI/AER: Use explicit register size for PCI_ERR_CAP
  PCI: qcom: Disable ASPM L0s for sc8280xp, sa8540p and sa8295p
  dt-bindings: PCI: qcom: Do not require 'msi-map-mask'
  dt-bindings: PCI: qcom: Allow 'required-opps'
  PCI/AER: Block runtime suspend when handling errors
  PCI/ASPM: Move pci_save_ltr_state() to aspm.c
  PCI/ASPM: Always build aspm.c
  ...
2024-03-14 10:58:27 -07:00
Linus Torvalds
07abb19a9b Power management updates for 6.9-rc1
- Allow the Energy Model to be updated dynamically (Lukasz Luba).
 
  - Add support for LZ4 compression algorithm to the hibernation image
    creation and loading code (Nikhil V).
 
  - Fix and clean up system suspend statistics collection (Rafael
    Wysocki).
 
  - Simplify device suspend and resume handling in the power management
    core code (Rafael Wysocki).
 
  - Fix PCI hibernation support description (Yiwei Lin).
 
  - Make hibernation take set_memory_ro() return values into account as
    appropriate (Christophe Leroy).
 
  - Set mem_sleep_current during kernel command line setup to avoid an
    ordering issue with handling it (Maulik Shah).
 
  - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
    driver's system suspend callback (Qingliang Li).
 
  - Simplify pm_runtime_get_if_active() usage and add a replacement for
    pm_runtime_put_autosuspend() (Sakari Ailus).
 
  - Add a tracepoint for runtime_status changes tracking (Vilas Bhat).
 
  - Fix section title markdown in the runtime PM documentation (Yiwei
    Lin).
 
  - Enable preferred core support in the amd-pstate cpufreq driver (Meng
    Li).
 
  - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
    min/max limit perf values in amd-pstate always stay within the
    (highest perf, lowest perf) range (Tor Vic, Meng Li).
 
  - Allow intel_pstate to assign model-specific values to strings used in
    the EPP sysfs interface and make it do so on Meteor Lake (Srinivas
    Pandruvada).
 
  - Drop long-unused cpudata::prev_cummulative_iowait from the
    intel_pstate cpufreq driver (Jiri Slaby).
 
  - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
    latter is an inefficient frequency (Shivnandan Kumar).
 
  - Change default transition delay in cpufreq to 2ms (Qais Yousef).
 
  - Remove references to 10ms minimum sampling rate from comments in the
    cpufreq code (Pierre Gondois).
 
  - Honour transition_latency over transition_delay_us in cpufreq (Qais
    Yousef).
 
  - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar).
 
  - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
    Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
    Belova).
 
  - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan).
 
  - Make the SCMI cpufreq driver get a transition delay value from
    firmware (Pierre Gondois).
 
  - Prevent the haltpoll cpuidle governor from shrinking guest
    poll_limit_ns below grow_start (Parshuram Sangle).
 
  - Avoid potential overflow in integer multiplication when computing
    cpuidle state parameters (C Cheng).
 
  - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
    driver and in intel_idle to return a correct value for C0 (He
    Rongguang).
 
  - Address multiple issues in the TPMI RAPL driver and add support for
    new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui).
 
  - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
    Lezcano).
 
  - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li).
 
  - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
    Norway Ananda).
 
  - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil).
 
  - Fix a couple of warnings in the OPP core code related to W=1
    builds (Viresh Kumar).
 
  - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
    Kumar).
 
  - Extend dev_pm_opp_data with turbo support (Sibi Sankar).
 
  - dt-bindings: drop maxItems from inner items (David Heidelberg).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmXvI/ISHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRx24sP/jxg6fOGme8raHQvpTXG3/H56wlGzQ4P
 YUvvKUXnfD3yf1zNISsUl7VQebZqDt8rygkwSdymXlUVZX1eubN0RpCFc0F8GZuc
 THG/YQhYQr/9zro3FpKhfDj5evk21PCQzjf+dGvfQF9qVMxNPG1JzEFK6PnolT5X
 2BvkonY1XFWZjCMbZ83B/jt35lTDb0cmeNbCpfD5UJgcnxmMOtZYpORdyfPWTJpG
 GVCwmAFVVXxXlust/AIpt3mmOpKzSA9GnrtJkhtQe5GN+Y4OjnJiFJmTC7EfCctj
 JlWgVUA716mtFMUrjXgjfI54firF2oQpqaSa2HG/V/A96JWQqjarGz5dAV1IrPEt
 ZmYpvMe4E90S411wF1OWyrEqjXUuDnH1OWUvUdWSt4E7DhFw3esDi/jLW2tyVKAT
 hIy+/O4wzbDSTX/h9Cgt1Qjhew6lKUIwvhEXclB3fuJ+JoviWNkC9lnK93e2H0A3
 VYfkd/lpUD74035l0FrCJ/49MjX9kqrsn+TipHsIlSXAi8ZRdKbVvxOTD8RYudcI
 GvCiDDrkMgNwGlyedgbtTBUepCvSg93b+vVmRj7YMPtBhioOUo3qCn6wpqhxfnth
 9BCnPW7JxqUw/NJdlk9hKumaUZq+MK8G+kdYcIDg6xmAkWSUVP2QKlWavfMCxqRP
 +dN6T2iHsKFe
 =UePT
 -----END PGP SIGNATURE-----

Merge tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "From the functional perspective, the most significant change here is
  the addition of support for Energy Models that can be updated
  dynamically at run time.

  There is also the addition of LZ4 compression support for hibernation,
  the new preferred core support in amd-pstate, new platforms support in
  the Intel RAPL driver, new model-specific EPP handling in intel_pstate
  and more.

  Apart from that, the cpufreq default transition delay is reduced from
  10 ms to 2 ms (along with some related adjustments), the system
  suspend statistics code undergoes a significant rework and there is a
  usual bunch of fixes and code cleanups all over.

  Specifics:

   - Allow the Energy Model to be updated dynamically (Lukasz Luba)

   - Add support for LZ4 compression algorithm to the hibernation image
     creation and loading code (Nikhil V)

   - Fix and clean up system suspend statistics collection (Rafael
     Wysocki)

   - Simplify device suspend and resume handling in the power management
     core code (Rafael Wysocki)

   - Fix PCI hibernation support description (Yiwei Lin)

   - Make hibernation take set_memory_ro() return values into account as
     appropriate (Christophe Leroy)

   - Set mem_sleep_current during kernel command line setup to avoid an
     ordering issue with handling it (Maulik Shah)

   - Fix wake IRQs handling when pm_runtime_force_suspend() is used as a
     driver's system suspend callback (Qingliang Li)

   - Simplify pm_runtime_get_if_active() usage and add a replacement for
     pm_runtime_put_autosuspend() (Sakari Ailus)

   - Add a tracepoint for runtime_status changes tracking (Vilas Bhat)

   - Fix section title markdown in the runtime PM documentation (Yiwei
     Lin)

   - Enable preferred core support in the amd-pstate cpufreq driver
     (Meng Li)

   - Fix min_perf assignment in amd_pstate_adjust_perf() and make the
     min/max limit perf values in amd-pstate always stay within the
     (highest perf, lowest perf) range (Tor Vic, Meng Li)

   - Allow intel_pstate to assign model-specific values to strings used
     in the EPP sysfs interface and make it do so on Meteor Lake
     (Srinivas Pandruvada)

   - Drop long-unused cpudata::prev_cummulative_iowait from the
     intel_pstate cpufreq driver (Jiri Slaby)

   - Prevent scaling_cur_freq from exceeding scaling_max_freq when the
     latter is an inefficient frequency (Shivnandan Kumar)

   - Change default transition delay in cpufreq to 2ms (Qais Yousef)

   - Remove references to 10ms minimum sampling rate from comments in
     the cpufreq code (Pierre Gondois)

   - Honour transition_latency over transition_delay_us in cpufreq (Qais
     Yousef)

   - Stop unregistering cpufreq cooling on CPU hot-remove (Viresh Kumar)

   - General enhancements / cleanups to ARM cpufreq drivers (tianyu2,
     Nícolas F. R. A. Prado, Erick Archer, Arnd Bergmann, Anastasia
     Belova)

   - Update cpufreq-dt-platdev to block/approve devices (Richard Acayan)

   - Make the SCMI cpufreq driver get a transition delay value from
     firmware (Pierre Gondois)

   - Prevent the haltpoll cpuidle governor from shrinking guest
     poll_limit_ns below grow_start (Parshuram Sangle)

   - Avoid potential overflow in integer multiplication when computing
     cpuidle state parameters (C Cheng)

   - Adjust MWAIT hint target C-state computation in the ACPI cpuidle
     driver and in intel_idle to return a correct value for C0 (He
     Rongguang)

   - Address multiple issues in the TPMI RAPL driver and add support for
     new platforms (Lunar Lake-M, Arrow Lake) to Intel RAPL (Zhang Rui)

   - Fix freq_qos_add_request() return value check in dtpm_cpu (Daniel
     Lezcano)

   - Fix kernel-doc for dtpm_create_hierarchy() (Yang Li)

   - Fix file leak in get_pkg_num() in x86_energy_perf_policy (Samasth
     Norway Ananda)

   - Fix cpupower-frequency-info.1 man page typo (Jan Kratochvil)

   - Fix a couple of warnings in the OPP core code related to W=1 builds
     (Viresh Kumar)

   - Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h (Viresh
     Kumar)

   - Extend dev_pm_opp_data with turbo support (Sibi Sankar)

   - dt-bindings: drop maxItems from inner items (David Heidelberg)"

* tag 'pm-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (95 commits)
  dt-bindings: opp: drop maxItems from inner items
  OPP: debugfs: Fix warning around icc_get_name()
  OPP: debugfs: Fix warning with W=1 builds
  cpufreq: Move dev_pm_opp_{init|free}_cpufreq_table() to pm_opp.h
  OPP: Extend dev_pm_opp_data with turbo support
  Fix cpupower-frequency-info.1 man page typo
  cpufreq: scmi: Set transition_delay_us
  firmware: arm_scmi: Populate fast channel rate_limit
  firmware: arm_scmi: Populate perf commands rate_limit
  cpuidle: ACPI/intel: fix MWAIT hint target C-state computation
  PM: sleep: wakeirq: fix wake irq warning in system suspend
  powercap: dtpm: Fix kernel-doc for dtpm_create_hierarchy() function
  cpufreq: Don't unregister cpufreq cooling on CPU hotplug
  PM: suspend: Set mem_sleep_current during kernel command line setup
  cpufreq: Honour transition_latency over transition_delay_us
  cpufreq: Limit resolving a frequency to policy min/max
  Documentation: PM: Fix runtime_pm.rst markdown syntax
  cpufreq: amd-pstate: adjust min/max limit perf
  cpufreq: Remove references to 10ms min sampling rate
  cpufreq: intel_pstate: Update default EPPs for Meteor Lake
  ...
2024-03-13 11:40:06 -07:00
Linus Torvalds
8c9c2f851b IOMMU Updates for Linux v6.9
Including:
 
 	- Core changes:
 	  - Constification of bus_type pointer
 	  - Preparations for user-space page-fault delivery
 	  - Use a named kmem_cache for IOVA magazines
 
 	- Intel VT-d changes from Lu Baolu:
 	  - Add RBTree to track iommu probed devices
 	  - Add Intel IOMMU debugfs document
 	  - Cleanup and refactoring
 
 	- ARM-SMMU Updates from Will Deacon:
 	  - Device-tree binding updates for a bunch of Qualcomm SoCs
 	  - SMMUv2: Support for Qualcomm X1E80100 MDSS
 	  - SMMUv3: Significant rework of the driver's STE manipulation and
 	    domain handling code. This is the initial part of a larger scale
 	    rework aiming to improve the driver's implementation of the
 	    IOMMU-API in preparation for hooking up IOMMUFD support.
 
 	- AMD-Vi Updates:
 	  - Refactor GCR3 table support for SVA
 	  - Cleanups
 
 	- Some smaller cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEr9jSbILcajRFYWYyK/BELZcBGuMFAmXuyf8ACgkQK/BELZcB
 GuNXwxAApkjDm7VWM2D2K8Y+8YLbtaljMCCudNZKhgT++HEo4YlXcA5NmOddMIFc
 qhF9EwAWlQfj3krJLJQSZ6v/joKpXSwS6LDYuEGmJ/pIGfN5HqaTsOCItriP7Mle
 ZgRTI28u5ykZt4b6IKG8QeexilQi2DsIxT46HFiHL0GrvcBcdxDuKnE22PNCTwU2
 25WyJzgo//Ht2BrwlhrduZVQUh0KzXYuV5lErvoobmT0v/a4llS20ov+IE/ut54w
 FxIqGR8rMdJ9D2dM0bWRkdJY/vJxokah2QHm0gcna3Gr2iENL2xWFUtm+j1B6Smb
 VuxbwMkB0Iz530eShebmzQ07e2f1rRb4DySriu4m/jb8we20AYqKMYaxQxZkU68T
 1hExo+/QJQil9p1t+7Eur+S1u6gRHOdqfBnCzGOth/zzY1lbEzpdp8b9M8wnGa4K
 Y0EDeUpKtVIP1ZRCBi8CGyU1jgJF13Nx7MnOalgGWjDysB5RPamnrhz71EuD6rLw
 Jxp2EYo8NQPmPbEcl9NDS+oOn5Fz5TyPiMF2GUzhb9KisLxUjriLoTaNyBsdFkds
 2q+x6KY8qPGk37NhN0ktfpk9CtSGN47Pm8ZznEkFt9AR96GJDX+3NhUNAwEKslwt
 1tavDmmdOclOfIpWtaMlKQTHGhuSBZo1A40ATeM/MjHQ8rEtwXk=
 =HV07
 -----END PGP SIGNATURE-----

Merge tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu

Pull iommu updates from Joerg Roedel:
 "Core changes:
    - Constification of bus_type pointer
    - Preparations for user-space page-fault delivery
    - Use a named kmem_cache for IOVA magazines

  Intel VT-d changes from Lu Baolu:
    - Add RBTree to track iommu probed devices
    - Add Intel IOMMU debugfs document
    - Cleanup and refactoring

  ARM-SMMU Updates from Will Deacon:
    - Device-tree binding updates for a bunch of Qualcomm SoCs
    - SMMUv2: Support for Qualcomm X1E80100 MDSS
    - SMMUv3: Significant rework of the driver's STE manipulation and
      domain handling code. This is the initial part of a larger scale
      rework aiming to improve the driver's implementation of the
      IOMMU-API in preparation for hooking up IOMMUFD support.

  AMD-Vi Updates:
    - Refactor GCR3 table support for SVA
    - Cleanups

  Some smaller cleanups and fixes"

* tag 'iommu-updates-v6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu: (88 commits)
  iommu: Fix compilation without CONFIG_IOMMU_INTEL
  iommu/amd: Fix sleeping in atomic context
  iommu/dma: Document min_align_mask assumption
  iommu/vt-d: Remove scalabe mode in domain_context_clear_one()
  iommu/vt-d: Remove scalable mode context entry setup from attach_dev
  iommu/vt-d: Setup scalable mode context entry in probe path
  iommu/vt-d: Fix NULL domain on device release
  iommu: Add static iommu_ops->release_domain
  iommu/vt-d: Improve ITE fault handling if target device isn't present
  iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
  PCI: Make pci_dev_is_disconnected() helper public for other drivers
  iommu/vt-d: Use device rbtree in iopf reporting path
  iommu/vt-d: Use rbtree to track iommu probed devices
  iommu/vt-d: Merge intel_svm_bind_mm() into its caller
  iommu/vt-d: Remove initialization for dynamically heap-allocated rcu_head
  iommu/vt-d: Remove treatment for revoking PASIDs with pending page faults
  iommu/vt-d: Add the document for Intel IOMMU debugfs
  iommu/vt-d: Use kcalloc() instead of kzalloc()
  iommu/vt-d: Remove INTEL_IOMMU_BROKEN_GFX_WA
  iommu: re-use local fwnode variable in iommu_ops_from_fwnode()
  ...
2024-03-13 09:15:30 -07:00
Linus Torvalds
9187210eee Networking changes for 6.9.
Core & protocols
 ----------------
 
  - Large effort by Eric to lower rtnl_lock pressure and remove locks:
 
    - Make commonly used parts of rtnetlink (address, route dumps etc.)
      lockless, protected by RCU instead of rtnl_lock.
 
    - Add a netns exit callback which already holds rtnl_lock,
      allowing netns exit to take rtnl_lock once in the core
      instead of once for each driver / callback.
 
    - Remove locks / serialization in the socket diag interface.
 
    - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.
 
    - Remove the dev_base_lock, depend on RCU where necessary.
 
  - Support busy polling on a per-epoll context basis. Poll length
    and budget parameters can be set independently of system defaults.
 
  - Introduce struct net_hotdata, to make sure read-mostly global config
    variables fit in as few cache lines as possible.
 
  - Add optional per-nexthop statistics to ease monitoring / debug
    of ECMP imbalance problems.
 
  - Support TCP_NOTSENT_LOWAT in MPTCP.
 
  - Ensure that IPv6 temporary addresses' preferred lifetimes are long
    enough, compared to other configured lifetimes, and at least 2 sec.
 
  - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.
 
  - Add support for the independent control state machine for bonding
    per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
    control state machine.
 
  - Add "network ID" to MCTP socket APIs to support hosts with multiple
    disjoint MCTP networks.
 
  - Re-use the mono_delivery_time skbuff bit for packets which user
    space wants to be sent at a specified time. Maintain the timing
    information while traversing veth links, bridge etc.
 
  - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.
 
  - Simplify many places iterating over netdevs by using an xarray
    instead of a hash table walk (hash table remains in place, for
    use on fastpaths).
 
  - Speed up scanning for expired routes by keeping a dedicated list.
 
  - Speed up "generic" XDP by trying harder to avoid large allocations.
 
  - Support attaching arbitrary metadata to netconsole messages.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Enforce VM_IOREMAP flag and range in ioremap_page_range and introduce
    VM_SPARSE kind and vm_area_[un]map_pages (used by bpf_arena).
 
  - Rework selftest harness to enable the use of the full range of
    ksft exit code (pass, fail, skip, xfail, xpass).
 
 Netfilter
 ---------
 
  - Allow userspace to define a table that is exclusively owned by a daemon
    (via netlink socket aliveness) without auto-removing this table when
    the userspace program exits. Such table gets marked as orphaned and
    a restarting management daemon can re-attach/regain ownership.
 
  - Speed up element insertions to nftables' concatenated-ranges set type.
    Compact a few related data structures.
 
 BPF
 ---
 
  - Add BPF token support for delegating a subset of BPF subsystem
    functionality from privileged system-wide daemons such as systemd
    through special mount options for userns-bound BPF fs to a trusted
    & unprivileged application.
 
  - Introduce bpf_arena which is sparse shared memory region between BPF
    program and user space where structures inside the arena can have
    pointers to other areas of the arena, and pointers work seamlessly
    for both user-space programs and BPF programs.
 
  - Introduce may_goto instruction that is a contract between the verifier
    and the program. The verifier allows the program to loop assuming it's
    behaving well, but reserves the right to terminate it.
 
  - Extend the BPF verifier to enable static subprog calls in spin lock
    critical sections.
 
  - Support registration of struct_ops types from modules which helps
    projects like fuse-bpf that seeks to implement a new struct_ops type.
 
  - Add support for retrieval of cookies for perf/kprobe multi links.
 
  - Support arbitrary TCP SYN cookie generation / validation in the TC
    layer with BPF to allow creating SYN flood handling in BPF firewalls.
 
  - Add code generation to inline the bpf_kptr_xchg() helper which
    improves performance when stashing/popping the allocated BPF objects.
 
 Wireless
 --------
 
  - Add SPP (signaling and payload protected) AMSDU support.
 
  - Support wider bandwidth OFDMA, as required for EHT operation.
 
 Driver API
 ----------
 
  - Major overhaul of the Energy Efficient Ethernet internals to support
    new link modes (2.5GE, 5GE), share more code between drivers
    (especially those using phylib), and encourage more uniform behavior.
    Convert and clean up drivers.
 
  - Define an API for querying per netdev queue statistics from drivers.
 
  - IPSec: account in global stats for fully offloaded sessions.
 
  - Create a concept of Ethernet PHY Packages at the Device Tree level,
    to allow parameterizing the existing PHY package code.
 
  - Enable Rx hashing (RSS) on GTP protocol fields.
 
 Misc
 ----
 
  - Improvements and refactoring all over networking selftests.
 
  - Create uniform module aliases for TC classifiers, actions,
    and packet schedulers to simplify creating modprobe policies.
 
  - Address all missing MODULE_DESCRIPTION() warnings in networking.
 
  - Extend the Netlink descriptions in YAML to cover message encapsulation
    or "Netlink polymorphism", where interpretation of nested attributes
    depends on link type, classifier type or some other "class type".
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
    - Intel (100G, ice, idpf):
      - support E825-C devices
    - nVidia/Mellanox:
      - support devices with one port and multiple PCIe links
    - Broadcom (bnxt):
      - support n-tuple filters
      - support configuring the RSS key
    - Wangxun (ngbe/txgbe):
      - implement irq_domain for TXGBE's sub-interrupts
    - Pensando/AMD:
      - support XDP
      - optimize queue submission and wakeup handling (+17% bps)
      - optimize struct layout, saving 28% of memory on queues
 
  - Ethernet NICs embedded and virtual:
    - Google cloud vNIC:
      - refactor driver to perform memory allocations for new queue
        config before stopping and freeing the old queue memory
    - Synopsys (stmmac):
      - obey queueMaxSDU and implement counters required by 802.1Qbv
    - Renesas (ravb):
      - support packet checksum offload
      - suspend to RAM and runtime PM support
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support for nexthop group statistics
    - Microchip:
      - ksz8: implement PHY loopback
      - add support for KSZ8567, a 7-port 10/100Mbps switch
 
  - PTP:
    - New driver for RENESAS FemtoClock3 Wireless clock generator.
    - Support OCP PTP cards designed and built by Adva.
 
  - CAN:
    - Support recvmsg() flags for own, local and remote traffic
      on CAN BCM sockets.
    - Support for esd GmbH PCIe/402 CAN device family.
    - m_can:
      - Rx/Tx submission coalescing
      - wake on frame Rx
 
  - WiFi:
    - Intel (iwlwifi):
      - enable signaling and payload protected A-MSDUs
      - support wider-bandwidth OFDMA
      - support for new devices
      - bump FW API to 89 for AX devices; 90 for BZ/SC devices
    - MediaTek (mt76):
      - mt7915: newer ADIE version support
      - mt7925: radio temperature sensor support
    - Qualcomm (ath11k):
      - support 6 GHz station power modes: Low Power Indoor (LPI),
        Standard Power) SP and Very Low Power (VLP)
      - QCA6390 & WCN6855: support 2 concurrent station interfaces
      - QCA2066 support
    - Qualcomm (ath12k):
      - refactoring in preparation for Multi-Link Operation (MLO) support
      - 1024 Block Ack window size support
      - firmware-2.bin support
      - support having multiple identical PCI devices (firmware needs to
        have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
      - QCN9274: support split-PHY devices
      - WCN7850: enable Power Save Mode in station mode
      - WCN7850: P2P support
    - RealTek:
      - rtw88: support for more rtw8811cu and rtw8821cu devices
      - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
      - rtlwifi: speed up USB firmware initialization
      - rtwl8xxxu:
        - RTL8188F: concurrent interface support
        - Channel Switch Announcement (CSA) support in AP mode
    - Broadcom (brcmfmac):
      - per-vendor feature support
      - per-vendor SAE password setup
      - DMI nvram filename quirk for ACEPC W5 Pro
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmXv0mgACgkQMUZtbf5S
 IrtgMxAAuRd+WJW++SENr4KxIWhYO1q6Xcxnai43wrNkan9swD24icG8TYALt4f3
 yoT6idQvWReAb5JNlh9rUQz8R7E0nJXlvEFn5MtJwcthx2C6wFo/XkJlddlRrT+j
 c2xGILwLjRhW65LaC0MZ2ECbEERkFz8xcGfK2SWzUgh6KYvPjcRfKFxugpM7xOQK
 P/Wnqhs4fVRS/Mj/bCcXcO+yhwC121Q3qVeQVjGS0AzEC65hAW87a/kc2BfgcegD
 EyI9R7mf6criQwX+0awubjfoIdr4oW/8oDVNvUDczkJkbaEVaLMQk9P5x/0XnnVS
 UHUchWXyI80Q8Rj12uN1/I0h3WtwNQnCRBuLSmtm6GLfCAwbLvp2nGWDnaXiqryW
 DVKUIHGvqPKjkOOMOVfSvfB3LvkS3xsFVVYiQBQCn0YSs/gtu4CoF2Nty9CiLPbK
 tTuxUnLdPDZDxU//l0VArZmP8p2JM7XQGJ+JH8GFH4SBTyBR23e0iyPSoyaxjnYn
 RReDnHMVsrS1i7GPhbqDJWn+uqMSs7N149i0XmmyeqwQHUVSJN3J2BApP2nCaDfy
 H2lTuYly5FfEezt61NvCE4qr/VsWeEjm1fYlFQ9dFn4pGn+HghyCpw+xD1ZN56DN
 lujemau5B3kk1UTtAT4ypPqvuqjkRFqpNV2LzsJSk/Js+hApw8Y=
 =oY52
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Large effort by Eric to lower rtnl_lock pressure and remove locks:

      - Make commonly used parts of rtnetlink (address, route dumps
        etc) lockless, protected by RCU instead of rtnl_lock.

      - Add a netns exit callback which already holds rtnl_lock,
        allowing netns exit to take rtnl_lock once in the core instead
        of once for each driver / callback.

      - Remove locks / serialization in the socket diag interface.

      - Remove 6 calls to synchronize_rcu() while holding rtnl_lock.

      - Remove the dev_base_lock, depend on RCU where necessary.

   - Support busy polling on a per-epoll context basis. Poll length and
     budget parameters can be set independently of system defaults.

   - Introduce struct net_hotdata, to make sure read-mostly global
     config variables fit in as few cache lines as possible.

   - Add optional per-nexthop statistics to ease monitoring / debug of
     ECMP imbalance problems.

   - Support TCP_NOTSENT_LOWAT in MPTCP.

   - Ensure that IPv6 temporary addresses' preferred lifetimes are long
     enough, compared to other configured lifetimes, and at least 2 sec.

   - Support forwarding of ICMP Error messages in IPSec, per RFC 4301.

   - Add support for the independent control state machine for bonding
     per IEEE 802.1AX-2008 5.4.15 in addition to the existing coupled
     control state machine.

   - Add "network ID" to MCTP socket APIs to support hosts with multiple
     disjoint MCTP networks.

   - Re-use the mono_delivery_time skbuff bit for packets which user
     space wants to be sent at a specified time. Maintain the timing
     information while traversing veth links, bridge etc.

   - Take advantage of MSG_SPLICE_PAGES for RxRPC DATA and ACK packets.

   - Simplify many places iterating over netdevs by using an xarray
     instead of a hash table walk (hash table remains in place, for use
     on fastpaths).

   - Speed up scanning for expired routes by keeping a dedicated list.

   - Speed up "generic" XDP by trying harder to avoid large allocations.

   - Support attaching arbitrary metadata to netconsole messages.

  Things we sprinkled into general kernel code:

   - Enforce VM_IOREMAP flag and range in ioremap_page_range and
     introduce VM_SPARSE kind and vm_area_[un]map_pages (used by
     bpf_arena).

   - Rework selftest harness to enable the use of the full range of ksft
     exit code (pass, fail, skip, xfail, xpass).

  Netfilter:

   - Allow userspace to define a table that is exclusively owned by a
     daemon (via netlink socket aliveness) without auto-removing this
     table when the userspace program exits. Such table gets marked as
     orphaned and a restarting management daemon can re-attach/regain
     ownership.

   - Speed up element insertions to nftables' concatenated-ranges set
     type. Compact a few related data structures.

  BPF:

   - Add BPF token support for delegating a subset of BPF subsystem
     functionality from privileged system-wide daemons such as systemd
     through special mount options for userns-bound BPF fs to a trusted
     & unprivileged application.

   - Introduce bpf_arena which is sparse shared memory region between
     BPF program and user space where structures inside the arena can
     have pointers to other areas of the arena, and pointers work
     seamlessly for both user-space programs and BPF programs.

   - Introduce may_goto instruction that is a contract between the
     verifier and the program. The verifier allows the program to loop
     assuming it's behaving well, but reserves the right to terminate
     it.

   - Extend the BPF verifier to enable static subprog calls in spin lock
     critical sections.

   - Support registration of struct_ops types from modules which helps
     projects like fuse-bpf that seeks to implement a new struct_ops
     type.

   - Add support for retrieval of cookies for perf/kprobe multi links.

   - Support arbitrary TCP SYN cookie generation / validation in the TC
     layer with BPF to allow creating SYN flood handling in BPF
     firewalls.

   - Add code generation to inline the bpf_kptr_xchg() helper which
     improves performance when stashing/popping the allocated BPF
     objects.

  Wireless:

   - Add SPP (signaling and payload protected) AMSDU support.

   - Support wider bandwidth OFDMA, as required for EHT operation.

  Driver API:

   - Major overhaul of the Energy Efficient Ethernet internals to
     support new link modes (2.5GE, 5GE), share more code between
     drivers (especially those using phylib), and encourage more
     uniform behavior. Convert and clean up drivers.

   - Define an API for querying per netdev queue statistics from
     drivers.

   - IPSec: account in global stats for fully offloaded sessions.

   - Create a concept of Ethernet PHY Packages at the Device Tree level,
     to allow parameterizing the existing PHY package code.

   - Enable Rx hashing (RSS) on GTP protocol fields.

  Misc:

   - Improvements and refactoring all over networking selftests.

   - Create uniform module aliases for TC classifiers, actions, and
     packet schedulers to simplify creating modprobe policies.

   - Address all missing MODULE_DESCRIPTION() warnings in networking.

   - Extend the Netlink descriptions in YAML to cover message
     encapsulation or "Netlink polymorphism", where interpretation of
     nested attributes depends on link type, classifier type or some
     other "class type".

  Drivers:

   - Ethernet high-speed NICs:
      - Add a new driver for Marvell's Octeon PCI Endpoint NIC VF.
      - Intel (100G, ice, idpf):
         - support E825-C devices
      - nVidia/Mellanox:
         - support devices with one port and multiple PCIe links
      - Broadcom (bnxt):
         - support n-tuple filters
         - support configuring the RSS key
      - Wangxun (ngbe/txgbe):
         - implement irq_domain for TXGBE's sub-interrupts
      - Pensando/AMD:
         - support XDP
         - optimize queue submission and wakeup handling (+17% bps)
         - optimize struct layout, saving 28% of memory on queues

   - Ethernet NICs embedded and virtual:
      - Google cloud vNIC:
         - refactor driver to perform memory allocations for new queue
           config before stopping and freeing the old queue memory
      - Synopsys (stmmac):
         - obey queueMaxSDU and implement counters required by 802.1Qbv
      - Renesas (ravb):
         - support packet checksum offload
         - suspend to RAM and runtime PM support

   - Ethernet switches:
      - nVidia/Mellanox:
         - support for nexthop group statistics
      - Microchip:
         - ksz8: implement PHY loopback
         - add support for KSZ8567, a 7-port 10/100Mbps switch

   - PTP:
      - New driver for RENESAS FemtoClock3 Wireless clock generator.
      - Support OCP PTP cards designed and built by Adva.

   - CAN:
      - Support recvmsg() flags for own, local and remote traffic on CAN
        BCM sockets.
      - Support for esd GmbH PCIe/402 CAN device family.
      - m_can:
         - Rx/Tx submission coalescing
         - wake on frame Rx

   - WiFi:
      - Intel (iwlwifi):
         - enable signaling and payload protected A-MSDUs
         - support wider-bandwidth OFDMA
         - support for new devices
         - bump FW API to 89 for AX devices; 90 for BZ/SC devices
      - MediaTek (mt76):
         - mt7915: newer ADIE version support
         - mt7925: radio temperature sensor support
      - Qualcomm (ath11k):
         - support 6 GHz station power modes: Low Power Indoor (LPI),
           Standard Power) SP and Very Low Power (VLP)
         - QCA6390 & WCN6855: support 2 concurrent station interfaces
         - QCA2066 support
      - Qualcomm (ath12k):
         - refactoring in preparation for Multi-Link Operation (MLO)
           support
         - 1024 Block Ack window size support
         - firmware-2.bin support
         - support having multiple identical PCI devices (firmware needs
           to have ATH12K_FW_FEATURE_MULTI_QRTR_ID)
         - QCN9274: support split-PHY devices
         - WCN7850: enable Power Save Mode in station mode
         - WCN7850: P2P support
      - RealTek:
         - rtw88: support for more rtw8811cu and rtw8821cu devices
         - rtw89: support SCAN_RANDOM_SN and SET_SCAN_DWELL
         - rtlwifi: speed up USB firmware initialization
         - rtwl8xxxu:
             - RTL8188F: concurrent interface support
             - Channel Switch Announcement (CSA) support in AP mode
      - Broadcom (brcmfmac):
         - per-vendor feature support
         - per-vendor SAE password setup
         - DMI nvram filename quirk for ACEPC W5 Pro"

* tag 'net-next-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2255 commits)
  nexthop: Fix splat with CONFIG_DEBUG_PREEMPT=y
  nexthop: Fix out-of-bounds access during attribute validation
  nexthop: Only parse NHA_OP_FLAGS for dump messages that require it
  nexthop: Only parse NHA_OP_FLAGS for get messages that require it
  bpf: move sleepable flag from bpf_prog_aux to bpf_prog
  bpf: hardcode BPF_PROG_PACK_SIZE to 2MB * num_possible_nodes()
  selftests/bpf: Add kprobe multi triggering benchmarks
  ptp: Move from simple ida to xarray
  vxlan: Remove generic .ndo_get_stats64
  vxlan: Do not alloc tstats manually
  devlink: Add comments to use netlink gen tool
  nfp: flower: handle acti_netdevs allocation failure
  net/packet: Add getsockopt support for PACKET_COPY_THRESH
  net/netlink: Add getsockopt support for NETLINK_LISTEN_ALL_NSID
  selftests/bpf: Add bpf_arena_htab test.
  selftests/bpf: Add bpf_arena_list test.
  selftests/bpf: Add unit tests for bpf_arena_alloc/free_pages
  bpf: Add helper macro bpf_addr_space_cast()
  libbpf: Recognize __arena global variables.
  bpftool: Recognize arena map type
  ...
2024-03-12 17:44:08 -07:00