linux

iv/linux

Author	SHA1	Message	Date
Dan Williams	516b300c4c	cxl/memdev: Formalize endpoint port linkage Move the endpoint port that the cxl_mem driver establishes from drvdata to a first class attribute. This is in preparation for device-memory drivers reusing the CXL core for memory region management. Those drivers need a type-safe method to retrieve their CXL port linkage. Leave drvdata for private usage of the cxl_mem driver not external consumers of a 'struct cxl_memdev' object. Reviewed-by: Fan Ni <fan.ni@samsung.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679264292.3436160.3901392135863405807.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:33 -07:00
Dan Williams	f3c8a37a43	cxl/pci: Unconditionally unmask 256B Flit errors The current check for 256B Flit mode is incomplete and unnecessary. It is incomplete because it fails to consider the link speed, or check for CXL link capabilities. It is unnecessary because unconditionally unmasking 256B Flit errors is a nop when 256B Flit operation is not available. Remove this check in preparation for creating a cxl_probe_link() helper to centralize this detection. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679263124.3436160.6228910132469454346.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:09 -07:00
Dan Williams	8c897b366c	cxl/region: Manage decoder target_type at decoder-attach time Switch-level (mid-level) decoders between the platform root and an endpoint can dynamically switch modes between HDM-H and HDM-D[B] depending on which region they target. Use the region type to fixup each decoder that gets allocated to map the given region. Note that endpoint decoders are meant to determine the region type, so warn if those ever need to be fixed up, but since it is possible to continue do so. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679262543.3436160.13053831955768440312.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:09 -07:00
Dan Williams	cecbb5da92	cxl/hdm: Default CXL_DEVTYPE_DEVMEM decoders to CXL_DECODER_DEVMEM In preparation for device-memory region creation, arrange for decoders of CXL_DEVTYPE_DEVMEM memdevs to default to CXL_DECODER_DEVMEM for their target type. Revisit this if a device ever shows up that wants to offer mixed HDM-H (Host-Only Memory) and HDM-DB support, or an CXL_DEVTYPE_DEVMEM device that supports HDM-H. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261945.3436160.11673393474107374595.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	5aa39a9165	cxl/port: Rename CXL_DECODER_{EXPANDER, ACCELERATOR} => {HOSTONLYMEM, DEVMEM} In preparation for support for HDM-D and HDM-DB configuration (device-memory, and device-memory with back-invalidate). Rename the current type designators to use HOSTONLYMEM and DEVMEM as a suffix. HDM-DB can be supported by devices that are not accelerators, so DEVMEM is a more generic term for that case. Fixup one location where this type value was open coded. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679261369.3436160.7042443847605280593.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	f6b8ab32e3	cxl/memdev: Make mailbox functionality optional In support of the Linux CXL core scaling for a wider set of CXL devices, allow for the creation of memdevs with some memory device capabilities disabled. Specifically, allow for CXL devices outside of those claiming to be compliant with the generic CXL memory device class code, like vendor specific Type-2/3 devices that host CXL.mem. This implies, allow for the creation of memdevs that only support component-registers, not necessarily memory-device-registers (like mailbox registers). A memdev derived from a CXL endpoint that does not support generic class code expectations is tagged "CXL_DEVTYPE_DEVMEM", while a memdev derived from a class-code compliant endpoint is tagged "CXL_DEVTYPE_CLASSMEM". The primary assumption of a CXL_DEVTYPE_DEVMEM memdev is that it optionally may not host a mailbox. Disable the command passthrough ioctl for memdevs that are not CXL_DEVTYPE_CLASSMEM, and return empty strings from memdev attributes associated with data retrieved via the class-device-standard IDENTIFY command. Note that empty strings were chosen over attribute visibility to maintain compatibility with shipping versions of cxl-cli that expect those attributes to always be present. Once cxl-cli has dropped that requirement this workaround can be deprecated. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679260782.3436160.7587293613945445365.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	59f8d15107	cxl/mbox: Move mailbox related driver state to its own data structure 'struct cxl_dev_state' makes too many assumptions about the capabilities of a CXL device. In particular it assumes a CXL device has a mailbox and all of the infrastructure and state that comes along with that. In preparation for supporting accelerator / Type-2 devices that may not have a mailbox and in general maintain a minimal core context structure, make mailbox functionality a super-set of 'struct cxl_dev_state' with 'struct cxl_memdev_state'. With this reorganization it allows for CXL devices that support HDM decoder mapping, but not other general-expander / Type-3 capabilities, to only enable that subset without the rest of the mailbox infrastructure coming along for the ride. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679260240.3436160.15520641540463704524.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	3fe7feb0f3	cxl: Remove leftover attribute documentation in 'struct cxl_dev_state' commit 14d788740774 ("cxl/mem: Consolidate CXL DVSEC Range enumeration in the core") ...removed @info from 'struct cxl_dev_state', but neglected to remove the corresponding kernel-doc entry. Complete the removal. Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20230606121054.000069e1@Huawei.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679259703.3436160.12583306507362357946.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	c192e5432f	cxl: Fix kernel-doc warnings After Jonathan noticed [1] that 'struct cxl_dev_state' had a kernel-doc entry without a corresponding struct attribute I ran the kernel-doc script to see what else might be broken. Fix these warnings: drivers/cxl/cxlmem.h:199: warning: This comment starts with '/*', but isn't a kernel-doc comment. Refer Documentation/doc-guide/kernel-doc.rst Event Interrupt Policy drivers/cxl/cxlmem.h:224: warning: Function parameter or member 'buf' not described in 'cxl_event_state' drivers/cxl/cxlmem.h:224: warning: Function parameter or member 'log_lock' not described in 'cxl_event_state' Note that scripts/kernel-doc only finds missing kernel-doc entries. It does not warn on too many kernel-doc entries, i.e. it did not catch the fact that @info refers to a not present member. Link: http://lore.kernel.org/r/20230606121054.000069e1@Huawei.com [1] Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679259170.3436160.3686460404739136336.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	688baac109	cxl/regs: Clarify when a 'struct cxl_register_map' is input vs output The @map parameter to cxl_probe_X_registers() is filled in with the mapping parameters of the register block. The @map parameter to cxl_map_X_registers() only reads that information to perform the mapping. Mark @map const for cxl_map_X_registers() to clarify that it is only an input to those helpers. Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168679258103.3436160.4941603739448763855.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 14:31:08 -07:00
Dan Williams	adfe19738b	cxl/region: Fix state transitions after reset failure Jonathan reports that failed attempts to reset a region (teardown its HDM decoder configuration) mistakenly advance the state of the region to "not committed". Revert to the previous state of the region on reset failure so that the reset can be re-attempted. Reported-by: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Closes: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696507968.3590522.14484000711718573626.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Dan Williams	2ab47045ac	cxl/region: Flag partially torn down regions as unusable cxl_region_decode_reset() walks all the decoders associated with a given region and disables them. Due to decoder ordering rules it is possible that a switch in the topology notices that a given decoder can not be shutdown before another region with a higher HPA is shutdown first. That can leave the region in a partially committed state. Capture that state in a new CXL_REGION_F_NEEDS_RESET flag and require that a successful cxl_region_decode_reset() attempt must be completed before cxl_region_probe() accepts the region. This is a corollary for the bug that Jonathan identified in "CXL/region : commit reset of out of order region appears to succeed." [1]. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Link: http://lore.kernel.org/r/20230316171441.0000205b@Huawei.com [1] Fixes: 176baefb2eb5 ("cxl/hdm: Commit decoder state to hardware") Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696507423.3590522.16254212607926684429.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Dan Williams	d1257d098a	cxl/region: Move cache invalidation before region teardown, and before setup Vikram raised a concern with the theoretical case of a CPU sending MemClnEvict to a device that is not prepared to receive. MemClnEvict is a message that is sent after a CPU has taken ownership of a cacheline from accelerator memory (HDM-DB). In the case of hotplug or HDM decoder reconfiguration it is possible that the CPU is holding old contents for a new device that has taken over the physical address range being cached by the CPU. To avoid this scenario, invalidate caches prior to tearing down an HDM decoder configuration. Now, this poses another problem that it is possible for something to speculate into that space while the decode configuration is still up, so to close that gap also invalidate prior to establish new contents behind a given physical address range. With this change the cache invalidation is now explicit and need not be checked in cxl_region_probe(), and that obviates the need for CXL_REGION_F_INCOHERENT. Cc: Jonathan Cameron <Jonathan.Cameron@Huawei.com> Fixes: d18bc74aced6 ("cxl/region: Manage CPU caches relative to DPA invalidation events") Reported-by: Vikram Sethi <vsethi@nvidia.com> Closes: http://lore.kernel.org/r/BYAPR12MB33364B5EB908BF7239BB996BBD53A@BYAPR12MB3336.namprd12.prod.outlook.com Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168696506886.3590522.4597053660991916591.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 13:45:40 -07:00
Robert Richter	5d2ffbe4b8	cxl/port: Store the downstream port's Component Register mappings in struct cxl_dport Same as for ports, also store the downstream port's Component Register mappings, use struct cxl_dport for that. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-16-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 12:22:53 -07:00
Robert Richter	19ab69a60e	cxl/port: Store the port's Component Register mappings in struct cxl_port CXL capabilities are stored in the Component Registers. To use them, the specific I/O ranges of the capabilities must be determined by probing the registers. For this, the whole Component Register range needs to be mapped temporarily to detect the offset and length of a capability range. In order to use more than one capability of a component (e.g. RAS and HDM) the Component Register are probed and its mappings created multiple times. This also causes overlapping I/O ranges as the whole Component Register range must be mapped again while a capability's I/O range is already mapped. Different capabilities cannot be setup at the same time. E.g. the RAS capability must be made available as soon as the PCI driver is bound, the HDM decoder is setup later during port enumeration. Moreover, during early setup it is still unknown if a certain capability is needed. A central capability setup is therefore not possible, capabilities must be individually enabled once needed during initialization. To avoid a duplicate register probe and overlapping I/O mappings, only probe the Component Registers one time and store the Component Register mapping in struct port. The stored mappings can be used later to iomap the capability register range when enabling the capability, which will be implemented in a follow-on patch. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-15-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 12:22:44 -07:00
Robert Richter	733b57f262	cxl/pci: Early setup RCH dport component registers from RCRB CXL RAS capabilities must be enabled and accessible as soon as the CXL endpoint is detected in the PCI hierarchy and bound to the cxl_pci driver. This needs to be independent of other modules such as cxl_port or cxl_mem. CXL RAS capabilities reside in the Component Registers. For an RCH this is determined by probing RCRB which is implemented very late once the CXL Memory Device is created. Change this by moving the RCRB probe to the cxl_pci driver. Do this by using a new introduced function cxl_pci_find_port() similar to cxl_mem_find_port() to determine the involved dport by the endpoint's PCI handle. Plug this into the existing cxl_pci_setup_regs() function to setup Component Registers. Probe the RCRB in case the Component Registers cannot be located through the CXL Register Locator capability. This unifies code and early sets up the Component Registers at the same time for both, VH and RCH mode. Only the cxl_pci driver is involved for this. This allows an early mapping of the CXL RAS capability registers. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-14-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:57:11 -07:00
Robert Richter	86917c560d	cxl/mem: Prepare for early RCH dport component register setup In order to move the RCH dport component register setup to cxl_pci the base address must be stored in CXL device state (cxlds) for both modes, RCH and VH. Store it in cxlds->component_reg_phys and use it for endpoint creation. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-13-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:57:02 -07:00
Robert Richter	f1d0525eff	cxl/regs: Remove early capability checks in Component Register setup When probing the Component Registers in function cxl_probe_regs() there are also checks for the existence of the HDM and RAS capabilities. The checks may fail for components that do not implement the HDM capability causing the Component Registers setup to fail too. Remove the checks for a generalized use of cxl_probe_regs() and check them directly before mapping the RAS or HDM capabilities. This allows it to setup other Component Registers esp. of an RCH Downstream Port, which will be implemented in a follow-on patch. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-12-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:51:36 -07:00
Robert Richter	d8bffff201	cxl/port: Remove Component Register base address from struct cxl_dport The Component Register base address @component_reg_phys is no longer used after the rework of the Component Register setup which now uses struct member @comp_map instead. Remove the base address. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-11-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:51:17 -07:00
Robert Richter	d02034b402	cxl/acpi: Directly bind the CEDT detected CHBCR to the Host Bridge's port During a Host Bridge's downstream port enumeration the CHBS entries in the CEDT table are parsed, its Component Register base address extracted and then stored in struct cxl_dport. The CHBS may contain either the RCRB (RCH mode) or the Host Bridge's Component Registers (CHBCR, VH mode). The RCRB further contains the CXL downstream port register base address, while in VH mode the CXL Downstream Switch Ports are visible in the PCI hierarchy and the DP's component regs are disovered using the CXL DVSEC register locator capability. The Component Registers derived from the CHBS for both modes are different and thus also must be treated differently. That is, in RCH mode, the component regs base should be bound to the dport, but in VH mode to the CXL host bridge's port object. The current implementation stores the CHBCR in addition in struct cxl_dport and copies it later from there to struct cxl_port. As a result, the dport contains the wrong Component Registers base address and, e.g. the RAS capability of a CXL Root Port cannot be detected. To fix the CHBCR binding, attach it directly to the Host Bridge's @cxl_port structure. Do this during port creation of the Host Bridge in add_host_bridge_uport(). Factor out CHBS parsing code in add_host_bridge_dport() and use it in both functions. Co-developed-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-10-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:41:45 -07:00
Robert Richter	f44c7b7ad9	cxl/acpi: Move add_host_bridge_uport() after cxl_get_chbs() Just moving code to reorder functions to later share cxl_get_chbs() with add_host_bridge_uport(). This makes changes in the next patch visible. No other changes at all. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-9-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:41:34 -07:00
Terry Bowman	d076bb8c4c	cxl/pci: Refactor component register discovery for reuse The endpoint implements component register setup code. Refactor it for reuse with RCRB, downstream port, and upstream port setup. Move PCI specifics from cxl_setup_regs() into cxl_pci_setup_regs(). Move cxl_setup_regs() into cxl/core/regs.c and export it. This also includes supporting static functions cxl_map_registerblock(), cxl_unmap_register_block() and cxl_probe_regs(). Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-8-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:39:38 -07:00
Robert Richter	573408049b	cxl/core/regs: Add @dev to cxl_register_map The corresponding device of a register mapping is used for devm operations and logging. For operations with struct cxl_register_map the device needs to be kept track separately. To simpify the involved function interfaces, add @dev to cxl_register_map. While at it also reorder function arguments of cxl_map_device_regs() and cxl_map_component_regs() to have the object @cxl_register_map first. As a result a bunch of functions are available to be used with a @cxl_register_map object. This patch is in preparation of reworking the component register setup code. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-7-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:58 -07:00
Dan Williams	7481653dee	cxl: Rename 'uport' to 'uport_dev' For symmetry with the recent rename of ->dport_dev for a 'struct cxl_dport', add the "_dev" suffix to the ->uport property of a 'struct cxl_port'. These devices represent the downstream-port-device and upstream-port-device respectively in the CXL/PCIe topology. Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-6-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:54 -07:00
Robert Richter	227db57459	cxl: Rename member @dport of struct cxl_dport to @dport_dev Reading code like dport->dport does not immediately suggest that this points to the corresponding device structure of the dport. Rename struct member @dport to @dport_dev. While at it, also rename @new argument of add_dport() to @dport. This better describes the variable as a dport (e.g. new->dport becomes to dport->dport_dev). Co-developed-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-5-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:37:49 -07:00
Dan Williams	0619337856	cxl/rch: Prepare for caching the MMIO mapped PCIe AER capability Prepare cxl_probe_rcrb() for retrieving more than just the component register block. The RCH AER handling code wants to get back to the AER capability that happens to be MMIO mapped rather then configuration cycles. Move RCRB specific downstream port data, like the RCRB base and the AER capability offset, into its own data structure ('struct cxl_rcrb_info') for cxl_probe_rcrb() to fill. Extend 'struct cxl_dport' to include a 'struct cxl_rcrb_info' attribute. This centralizes all RCRB scanning in one routine. Co-developed-by: Robert Richter <rrichter@amd.com> Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-4-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:35:26 -07:00
Robert Richter	eb4663b07e	cxl/acpi: Probe RCRB later during RCH downstream port creation The RCRB is extracted already during ACPI CEDT table parsing while the data of this is needed not earlier than dport creation. This implementation comes with drawbacks: During ACPI table scan there is already MMIO access including mapping and unmapping, but only ACPI data should be collected here. The collected data must be transferred through a couple of interfaces until it is finally consumed when creating the dport. This causes complex data structures and function interfaces. Additionally, RCRB parsing will be extended to also extract AER data, it would be much easier do this at a later point during port and dport creation when the data structures are available to hold that data. To simplify all that, probe the RCRB at a later point during RCH downstream port creation. Change ACPI table parser to only extract the base address of either the component registers or the RCRB. Parse and extract the RCRB in devm_cxl_add_rch_dport(). This is in preparation to centralize all RCRB scanning. Signed-off-by: Robert Richter <rrichter@amd.com> Signed-off-by: Terry Bowman <terry.bowman@amd.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230622205523.85375-2-terry.bowman@amd.com Co-developed-by: Dan Williams <dan.j.williams@intel.com> Link: https://lore.kernel.org/r/20230622205523.85375-3-terry.bowman@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-06-25 11:35:20 -07:00
Jonathan Cameron	1ad3f701c3	cxl/pci: Find and register CXL PMU devices CXL PMU devices can be found from entries in the Register Locator DVSEC. Reviewed-by: Dan Williams <dan.j.williams@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/20230526095824.16336-4-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-30 11:20:35 -07:00
Jonathan Cameron	d717d7f3df	cxl: Add functions to get an instance of / count regblocks of a given type Until the recently release CXL 3.0 specification, there was only ever one instance of any given register block pointed to by the Register Block Locator DVSEC. Now, the specification allows for multiple CXL PMU instances, each with their own register block. To enable this add cxl_find_regblock_instance() that takes an index parameter and use that to implement cxl_count_regblock() and cxl_find_regblock(). Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230526095824.16336-3-Jonathan.Cameron@huawei.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-30 11:20:35 -07:00
Dave Jiang	793a539ac7	cxl: Explicitly initialize resources when media is not ready When media is not ready do not assume that the capacity information from the identify command is valid, i.e. ->total_bytes ->partition_align_bytes ->{volatile,persistent}_only_bytes. Explicitly zero out the capacity resources and exit early. Given zero-init of those fields this patch is functionally equivalent to the prior state, but it improves readability and robustness going forward. Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168506118166.3004974.13523455340007852589.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-26 13:34:39 -07:00
Davidlohr Bueso	ccadf1310f	cxl/mbox: Add background cmd handling machinery This adds support for handling background operations, as defined in the CXL 3.0 spec. Commands that can take too long (over ~2 seconds) can run in the background asynchronously (to the hardware). The driver will deal with such commands synchronously, blocking all other incoming commands for a specified period of time, allowing time-slicing the command such that the caller can send incremental requests to avoid monopolizing the driver/device. Any out of sync (timeout) between the driver and hardware is just disregarded as an invalid state until the next successful submission. Such timeouts are considered a rare occurrence, either a real device problem or a driver issue that needs to reduce the size of the background operation to fit the timeout. On devices where mbox interrupts are supported, this will still use a poller that will wakeup in the specified wait intervals. The irq handler will simply awake the blocked cmd, which is also safe vs a task that is either waking (timing out) or already awoken. Similarly any irq setup error during the probing falls back to polling, thus avoids unnecessarily erroring out. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230523170927.20685-5-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-23 12:55:12 -07:00
Davidlohr Bueso	9f7a320d16	cxl/pci: Introduce cxl_request_irq() Factor out common functionality/semantics for cxl shared interrupts into a new helper on top of devm_request_irq(). Suggested-by: Dan Williams <dan.j.williams@intel.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230523170927.20685-4-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-23 12:55:12 -07:00
Davidlohr Bueso	f279d0bc13	cxl/pci: Allocate irq vectors earlier during probe Move the cxl_alloc_irq_vectors() call further up in the probing in order to allow for mailbox interrupt usage. No change in semantics. Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/20230523170927.20685-3-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-23 12:55:12 -07:00
Robert Richter	a70fc4ed20	cxl/port: Fix NULL pointer access in devm_cxl_add_port() In devm_cxl_add_port() the port creation may fail and its associated pointer does not contain a valid address. During error message generation this invalid port address is used. Fix that wrong address access. Fixes: f3cd264c4ec1 ("cxl: Unify debug messages when calling devm_cxl_add_port()") Signed-off-by: Robert Richter <rrichter@amd.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230519215436.3394532-1-rrichter@amd.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-19 17:47:01 -07:00
Dave Jiang	e764f12208	cxl: Move cxl_await_media_ready() to before capacity info retrieval Move cxl_await_media_ready() to cxl_pci probe before driver starts issuing IDENTIFY and retrieving memory device information to ensure that the device is ready to provide the information. Allow cxl_pci_probe() to succeed even if media is not ready. Cache the media failure in cxlds and don't ask the device for any media information. The rationale for proceeding in the !media_ready case is to allow for mailbox operations to interrogate and/or remediate the device. After media is repaired then rebinding the cxl_pci driver is expected to restart the capacity scan. Suggested-by: Dan Williams <dan.j.williams@intel.com> Fixes: b39cb1052a5c ("cxl/mem: Register CXL memX devices") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168445310026.3251520.8124296540679268206.stgit@djiang5-mobl3 [djbw: fixup cxl_test] Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 16:43:45 -07:00
Dave Jiang	ce17ad0d54	cxl: Wait Memory_Info_Valid before access memory related info The Memory_Info_Valid bit (CXL 3.0 8.1.3.8.2) indicates that the CXL Range Size High and Size Low registers are valid. The bit must be set within 1 second of reset deassertion to the device. Check valid bit before we check the Memory_Active bit when waiting for cxl_await_media_ready() to ensure that the memory info is valid for consumption. Also ensures both DVSEC ranges 1 and 2 are ready if DVSEC Capability indicates they are both supported. Fixes: 523e594d9cc0 ("cxl/pci: Implement wait for media active") Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/168444687469.3134781.11033518965387297327.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 16:42:41 -07:00
Dan Williams	eb0764b822	cxl/port: Enable the HDM decoder capability for switch ports Derick noticed, when testing hot plug, that hot-add behaves nominally after a removal. However, if the hot-add is done without a prior removal, CXL.mem accesses fail. It turns out that the original implementation of the port driver and region programming wrongly assumed that platform-firmware always enables the host-bridge HDM decoder capability. Add support turning on switch-level HDM decoders in the case where platform-firmware has not. The implementation is careful to only arrange for the enable to be undone if the current instance of the driver was the one that did the enable. This is to interoperate with platform-firmware that may expect CXL.mem to remain active after the driver is shutdown. This comes at the cost of potentially not shutting down the enable on kexec flows, but it is mitigated by the fact that the related HDM decoders still need to be enabled on an individual basis. Cc: <stable@vger.kernel.org> Reported-by: Derick Marks <derick.w.marks@intel.com> Fixes: 54cdbf845cf7 ("cxl/port: Add a driver for 'struct cxl_port' objects") Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/168437998331.403037.15719879757678389217.stgit@dwillia2-xfh.jf.intel.com Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-18 13:18:49 -07:00
Dave Jiang	764d102ef9	cxl: Add missing return to cdat read error path Add a return to the error path when cxl_cdat_read_table() fails. Current code continues with the table pointer points to freed memory. Fixes: 7a877c923995 ("cxl/pci: Simplify CDAT retrieval error path") Signed-off-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Davidlohr Bueso <dave@stgolabs.net> Link: https://lore.kernel.org/r/168382793506.3510737.4792518576623749076.stgit@djiang5-mobl3 Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-05-13 00:20:06 -07:00
Linus Torvalds	7acc137211	cxl for v6.4 - Refactor the DOE infrastructure (Data Object Exchange PCI-config-cycle mailbox) to be a facility of the PCI core rather than the CXL core. This is foundational for upcoming support for PCI device-attestation and PCIe / CXL link encryption. - Add support for retrieving and injecting poison for CXL memory expanders. This enabling uses trace-events to convey CXL media error records to user tooling. It includes translation of device-local addresses (DPA) to system physical addresses (SPA) and their corresponding CXL region. - Fixes for decoder enumeration that missed v6.3-final - Miscellaneous fixups -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZE2nNwAKCRDfioYZHlFs Z5c2AQCTWebok6CD+HN01xnIx+CBWAUQe0QIGR40dT2P6/WGEgEA8wMae0w/FDlc lQDvSoIyPvy1hGO7Ppb0K2AT6jrQAgU= =blcC -----END PGP SIGNATURE----- Merge tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link updates from Dan Williams: "DOE support is promoted from drivers/cxl/ to drivers/pci/ with Bjorn's blessing, and the CXL core continues to mature its media management capabilities with support for listing and injecting media errors. Some late fixes that missed v6.3-final are also included: - Refactor the DOE infrastructure (Data Object Exchange PCI-config-cycle mailbox) to be a facility of the PCI core rather than the CXL core. This is foundational for upcoming support for PCI device-attestation and PCIe / CXL link encryption. - Add support for retrieving and injecting poison for CXL memory expanders. This enabling uses trace-events to convey CXL media error records to user tooling. It includes translation of device-local addresses (DPA) to system physical addresses (SPA) and their corresponding CXL region. - Fixes for decoder enumeration that missed v6.3-final - Miscellaneous fixups" * tag 'cxl-for-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (38 commits) cxl/test: Add mock test for set_timestamp cxl/mbox: Update CMD_RC_TABLE tools/testing/cxl: Require CONFIG_DEBUG_FS tools/testing/cxl: Add a sysfs attr to test poison inject limits tools/testing/cxl: Use injected poison for get poison list tools/testing/cxl: Mock the Clear Poison mailbox command tools/testing/cxl: Mock the Inject Poison mailbox command cxl/mem: Add debugfs attributes for poison inject and clear cxl/memdev: Trace inject and clear poison as cxl_poison events cxl/memdev: Warn of poison inject or clear to a mapped region cxl/memdev: Add support for the Clear Poison mailbox command cxl/memdev: Add support for the Inject Poison mailbox command tools/testing/cxl: Mock support for Get Poison List cxl/trace: Add an HPA to cxl_poison trace events cxl/region: Provide region info to the cxl_poison trace event cxl/memdev: Add trigger_poison_list sysfs attribute cxl/trace: Add TRACE support for CXL media-error records cxl/mbox: Add GET_POISON_LIST mailbox command cxl/mbox: Initialize the poison state cxl/mbox: Restrict poison cmds to debugfs cxl_raw_allow_all ...	2023-04-30 11:51:51 -07:00
Linus Torvalds	556eb8b791	Driver core changes for 6.4-rc1 Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZEp7Sw8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ykitQCfamUHpxGcKOAGuLXMotXNakTEsxgAoIquENm5 LEGadNS38k5fs+73UaxV =7K4B -----END PGP SIGNATURE----- Merge tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the large set of driver core changes for 6.4-rc1. Once again, a busy development cycle, with lots of changes happening in the driver core in the quest to be able to move "struct bus" and "struct class" into read-only memory, a task now complete with these changes. This will make the future rust interactions with the driver core more "provably correct" as well as providing more obvious lifetime rules for all busses and classes in the kernel. The changes required for this did touch many individual classes and busses as many callbacks were changed to take const * parameters instead. All of these changes have been submitted to the various subsystem maintainers, giving them plenty of time to review, and most of them actually did so. Other than those changes, included in here are a small set of other things: - kobject logging improvements - cacheinfo improvements and updates - obligatory fw_devlink updates and fixes - documentation updates - device property cleanups and const * changes - firwmare loader dependency fixes. All of these have been in linux-next for a while with no reported problems" * tag 'driver-core-6.4-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (120 commits) device property: make device_property functions take const device * driver core: update comments in device_rename() driver core: Don't require dynamic_debug for initcall_debug probe timing firmware_loader: rework crypto dependencies firmware_loader: Strip off \n from customized path zram: fix up permission for the hot_add sysfs file cacheinfo: Add use_arch[\|_cache]_info field/function arch_topology: Remove early cacheinfo error message if -ENOENT cacheinfo: Check cache properties are present in DT cacheinfo: Check sib_leaf in cache_leaves_are_shared() cacheinfo: Allow early level detection when DT/ACPI info is missing/broken cacheinfo: Add arm64 early level initializer implementation cacheinfo: Add arch specific early level initializer tty: make tty_class a static const structure driver core: class: remove struct class_interface * from callbacks driver core: class: mark the struct class in struct class_interface constant driver core: class: make class_register() take a const * driver core: class: mark class_release() as taking a const * driver core: remove incorrect comment for device_create* MIPS: vpe-cmp: remove module owner pointer from struct class usage. ...	2023-04-27 11:53:57 -07:00
Davidlohr Bueso	bfe58458fd	cxl/mbox: Update CMD_RC_TABLE As of CXL 3.0 there have some added return codes, update the driver accordingly. Signed-off-by: Davidlohr Bueso <dave@stgolabs.net> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/20230307042655.6714-1-dave@stgolabs.net Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 12:10:26 -07:00
Dan Williams	ca899f4021	Merge branch 'for-6.3/cxl-autodetect-fixes' into for-6.4/cxl Pick up late v6.3 fixes for v6.4.	2023-04-23 12:10:13 -07:00
Dan Williams	856ef55e7e	Merge branch 'for-6.4/cxl-poison' into for-6.4/cxl Include the poison list and injection infrastructure from Alison for v6.4.	2023-04-23 12:09:56 -07:00
Alison Schofield	50d527f52c	cxl/mem: Add debugfs attributes for poison inject and clear Inject and Clear Poison commands are optionally supported by CXL memdev devices and are intended for use in debug environments only. Add debugfs attributes for user access. Documentation/ABI/testing/debugfs-cxl describes the usage. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/0c9ea8e671b8e58465d18722788b60d325c675c7.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 12:08:39 -07:00
Alison Schofield	98b6926562	cxl/memdev: Trace inject and clear poison as cxl_poison events The cxl_poison trace event allows users to view the history of poison list reads. With the addition of inject and clear poison capabilities, users will expect similar tracing. Add trace types 'Inject' and 'Clear' to the cxl_poison trace_event and trace successful operations only. If the driver finds that the DPA being injected or cleared of poison is mapped in a region, that region info is included in the cxl_poison trace event. Region reconfigurations can make this extra info useless if the debug operations are not carefully managed. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/e20eb7c3029137b480ece671998c183da0477e2e.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 12:08:39 -07:00
Alison Schofield	0a105ab28a	cxl/memdev: Warn of poison inject or clear to a mapped region Inject and clear poison capabilities and intended for debug usage only. In order to be useful in debug environments, the driver needs to allow inject and clear operations on DPAs mapped in regions. dev_warn_once() when either operation occurs. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Link: https://lore.kernel.org/r/f911ca5277c9d0f9757b72d7e6842871bfff4fa2.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	9690b07748	cxl/memdev: Add support for the Clear Poison mailbox command CXL devices optionally support the CLEAR POISON mailbox command. Add memdev driver support for clearing poison. Per the CXL Specification (3.0 8.2.9.8.4.3), after receiving a valid clear poison request, the device removes the address from the device's Poison List and writes 0 (zero) for 64 bytes starting at address. If the device cannot clear poison from the address, it returns a permanent media error and -ENXIO is returned to the user. Additionally, and per the spec also, it is not an error to clear poison of an address that is not poisoned. If the address is not contained in the device's dpa resource, or is not 64 byte aligned, the driver returns -EINVAL without sending the command to the device. Poison clearing is intended for debug only and will be exposed to userspace through debugfs. Restrict compilation to CONFIG_DEBUG_FS. Implementation note: Although the CXL specification defines the clear command to accept 64 bytes of 'write-data', this implementation always uses zeroes as write-data. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Link: https://lore.kernel.org/r/8682c30ec24bd9c45af5feccb04b02be51e58c0a.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	d2fbc48658	cxl/memdev: Add support for the Inject Poison mailbox command CXL devices optionally support the INJECT POISON mailbox command. Add memdev driver support for the mailbox command. Per the CXL Specification (3.0 8.2.9.8.4.2), after receiving a valid inject poison request, the device will return poison when the address is accessed through the CXL.mem driver. Injecting poison adds the address to the device's Poison List and the error source is set to Injected. In addition, the device adds a poison creation event to its internal Informational Event log, updates the Event Status register, and if configured, interrupts the host. Also, per the CXL Specification, it is not an error to inject poison into an address that already has poison present and no error is returned from the device. If the address is not contained in the device's dpa resource, or is not 64 byte aligned, return -EINVAL without issuing the mbox command. Poison injection is intended for debug only and will be exposed to userspace through debugfs. Restrict compilation to CONFIG_DEBUG_FS. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/241c64115e6bd2effed9c7a20b08b3908dd7be8f.1681874357.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:22 -07:00
Alison Schofield	28a3ae4ff6	cxl/trace: Add an HPA to cxl_poison trace events When a cxl_poison trace event is reported for a region, the poisoned Device Physical Address (DPA) can be translated to a Host Physical Address (HPA) for consumption by user space. Translate and add the resulting HPA to the cxl_poison trace event. Follow the device decode logic as defined in the CXL Spec 3.0 Section 8.2.4.19.13. If no region currently maps the poison, assign ULLONG_MAX to the cxl_poison event hpa field. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Link: https://lore.kernel.org/r/6d3cd726f9042a59902785b0a2cb3ddfb70e0219.1681838292.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:13 -07:00
Alison Schofield	f0832a5863	cxl/region: Provide region info to the cxl_poison trace event User space may need to know which region, if any, maps the poison address(es) logged in a cxl_poison trace event. Since the mapping of DPAs (device physical addresses) to a region can change, the kernel must provide this information at the time the poison list is read. The event informs user space that at event <timestamp> this <region> mapped to this <DPA>, which is poisoned. The cxl_poison trace event is already wired up to log the region name and uuid if it receives param 'struct cxl_region'. In order to provide that cxl_region, add another method for gathering poison - by committed endpoint decoder mappings. This method is only available with CONFIG_CXL_REGION and is only used if a region actually maps the memdev where poison is being read. After the region driver reads the poison list for all the mapped resources, poison is read for any remaining unmapped resources. The default method remains: read the poison by memdev resource. Signed-off-by: Alison Schofield <alison.schofield@intel.com> Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Ira Weiny <ira.weiny@intel.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Link: https://lore.kernel.org/r/438b01ccaa70592539e8eda4eb2b1d617ba03160.1681838292.git.alison.schofield@intel.com Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Dan Williams <dan.j.williams@intel.com>	2023-04-23 11:46:06 -07:00

... 2 3 4 5 6 ...

610 Commits