linux

iv/linux

Author	SHA1	Message	Date
Frederic Barrat	bdecf76e31	cxl: Fix coredump generation when cxl_get_fd() is used If a process dumps core while owning a cxl file descriptor obtained from an AFU driver (e.g. cxlflash) through the cxl_get_fd() API, the following error occurs: [ 868.027591] Unable to handle kernel paging request for data at address ... [ 868.027778] Faulting instruction address: 0xc00000000035edb0 cpu 0x8c: Vector: 300 (Data Access) at [c000003c688275e0] pc: c00000000035edb0: elf_core_dump+0xd60/0x1300 lr: c00000000035ed80: elf_core_dump+0xd30/0x1300 sp: c000003c68827860 msr: 9000000100009033 dar: c dsisr: 40000000 current = 0xc000003c68780000 paca = 0xc000000001b73200 softe: 0 irq_happened: 0x01 pid = 46725, comm = hxesurelock enter ? for help [c000003c68827a60] c00000000036948c do_coredump+0xcec/0x11e0 [c000003c68827c20] c0000000000ce9e0 get_signal+0x540/0x7b0 [c000003c68827d10] c000000000017354 do_signal+0x54/0x2b0 [c000003c68827e00] c00000000001777c do_notify_resume+0xbc/0xd0 [c000003c68827e30] c000000000009838 ret_from_except_lite+0x64/0x68 --- Exception: 300 (Data Access) at 00003fff98ad2918 The root cause is that the address_space structure for the file doesn't define a 'host' member. When cxl allocates a file descriptor, it's using the anonymous inode to back the file, but allocates a private address_space for each context. The private address_space allows to track memory allocation for each context. cxl doesn't define the 'host' member of the address space, i.e. the inode. We don't want to define it as the anonymous inode, since there's no longer a 1-to-1 relation between address_space and inode. To fix it, instead of using the anonymous inode, we introduce a simple pseudo filesystem so that cxl can allocate its own inodes. So we now have one inode for each file and address_space. The pseudo filesystem is only mounted on the first allocation of a file descriptor by cxl_get_fd(). Tested with cxlflash. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-11-18 23:02:17 +11:00
Vaibhav Jain	70b565bbdb	cxl: Prevent adapter reset if an active context exists This patch prevents resetting the cxl adapter via sysfs in presence of one or more active cxl_context on it. This protects against an unrecoverable error caused by PSL owning a dirty cache line even after reset and host tries to touch the same cache line. In case a force reset of the card is required irrespective of any active contexts, the int value -1 can be stored in the 'reset' sysfs attribute of the card. The patch introduces a new atomic_t member named contexts_num inside struct cxl that holds the number of active context attached to the card , which is checked against '0' before proceeding with the reset. To prevent against a race condition where a context is activated just after reset check is performed, the contexts_num is atomically set to '-1' after reset-check to indicate that no more contexts can be activated on the card anymore. Before activating a context we atomically test if contexts_num is non-negative and if so, increment its value by one. In case the value of contexts_num is negative then it indicates that the card is about to be reset and context activation is error-ed out at that point. Fixes: `62fa19d4b4` ("cxl: Add ability to reset the card") Cc: stable@vger.kernel.org # v4.0+ Acked-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-19 20:35:39 +11:00
Frederic Barrat	aaa2245ed8	cxl: Flush PSL cache before resetting the adapter If the capi link is going down while the PSL owns a dirty cache line, any access from the host for that data could lead to an Uncorrectable Error. So when resetting the capi adapter through sysfs, make sure the PSL cache is flushed. It won't help if there are any active Process Elements on the card, as the cache would likely get new dirty cache lines immediately, but if resetting an idle adapter, it should avoid any bad surprises from data left over from terminated Process Elements. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-10-04 16:16:42 +11:00
Andrew Donnellan	164793379a	cxl: Fix NULL dereference in cxl_context_init() on PowerVM guests Commit `f67a6722d6` ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4") added a "min_pe" field to struct cxl_service_layer_ops, to allow us to work around a Mellanox CX-4 hardware limitation. When allocating the PE number in cxl_context_init(), we read from ctx->afu->adapter->native->sl_ops->min_pe to get the minimum PE number. Unsurprisingly, in a PowerVM guest ctx->afu->adapter->native is NULL, and guests don't have a cxl_service_layer_ops struct anywhere. Move min_pe from struct cxl_service_layer_ops to struct cxl so it's accessible in both native and PowerVM environments. For the Mellanox CX-4, set the min_pe value in set_sl_ops(). Fixes: `f67a6722d6` ("cxl: Workaround PE=0 hardware limitation in Mellanox CX4") Reported-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-08-09 16:52:02 +10:00
Ian Munsie	f67a6722d6	cxl: Workaround PE=0 hardware limitation in Mellanox CX4 The CX4 card cannot cope with a context with PE=0 due to a hardware limitation, resulting in: [ 34.166577] command failed, status limits exceeded(0x8), syndrome 0x5a7939 [ 34.166580] mlx5_core 0000:01:00.1: Failed allocating uar, aborting Since the kernel API allocates a default context very early during device init that will almost certainly get Process Element ID 0 there is no easy way for us to extend the API to allow the Mellanox to inform us of this limitation ahead of time. Instead, work around the issue by extending the XSL structure to include a minimum PE to allocate. Although the bug is not in the XSL, it is the easiest place to work around this limitation given that the CX4 is currently the only card that uses an XSL. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:28:07 +10:00
Ian Munsie	a2f67d5ee8	cxl: Add support for interrupts on the Mellanox CX4 The Mellanox CX4 in cxl mode uses a hybrid interrupt model, where interrupts are routed from the networking hardware to the XSL using the MSIX table, and from there will be transformed back into an MSIX interrupt using the cxl style interrupts (i.e. using IVTE entries and ranges to map a PE and AFU interrupt number to an MSIX address). We want to hide the implementation details of cxl interrupts as much as possible. To this end, we use a special version of the MSI setup & teardown routines in the PHB while in cxl mode to allocate the cxl interrupts and configure the IVTE entries in the process element. This function does not configure the MSIX table - the CX4 card uses a custom format in that table and it would not be appropriate to fill that out in generic code. The rest of the functionality is similar to the "Full MSI-X mode" described in the CAIA, and this could be easily extended to support other adapters that use that mode in the future. The interrupts will be associated with the default context. If the maximum number of interrupts per context has been limited (e.g. by the mlx5 driver), it will automatically allocate additional kernel contexts to associate extra interrupts as required. These contexts will be started using the same WED that was used to start the default context. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:27:08 +10:00
Ian Munsie	cbce0917e2	cxl: Add preliminary workaround for CX4 interrupt limitation The Mellanox CX4 has a hardware limitation where only 4 bits of the AFU interrupt number can be passed to the XSL when sending an interrupt, limiting it to only 15 interrupts per context (AFU interrupt number 0 is invalid). In order to overcome this, we will allocate additional contexts linked to the default context as extra address space for the extra interrupts - this will be implemented in the next patch. This patch adds the preliminary support to allow this, by way of adding a linked list in the context structure that we use to keep track of the contexts dedicated to interrupts, and an API to simultaneously iterate over the related context structures, AFU interrupt numbers and hardware interrupt numbers. The point of using a single API to iterate these is to hide some of the details of the iteration from external code, and to reduce the number of APIs that need to be exported via base.c to allow built in code to call. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:27:02 +10:00
Ian Munsie	a19bd79e31	cxl: Allow a default context to be associated with an external pci_dev The cxl kernel API has a concept of a default context associated with each PCI device under the virtual PHB. The Mellanox CX4 will also use the cxl kernel API, but it does not use a virtual PHB - rather, the AFU appears as a physical function as a peer to the networking functions. In order to allow the kernel API to work with those networking functions, we will need to associate a default context with them as well. To this end, refactor the corresponding code to do this in vphb.c and export it so that it can be called from the PHB code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:52 +10:00
Ian Munsie	62ccf2d2ef	cxl: Move cxl_afu_get / cxl_afu_put to base The Mellanox CX4 uses a model where the AFU is one physical function of the device, and is used by other peer physical functions of the same device. This will require those other devices to grab a reference on the AFU when they are initialised to make sure that it does not go away during their lifetime. Move the AFU refcount functions to base.c so they can be called from the PHB code. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-14 20:26:36 +10:00
Philippe Bergheaud	6e0c50f9e8	cxl: Refine slice error debug messages The PSL Slice Error Register (PSL_SERR_An) reports implementation dependent AFU errors, in the form of a bitmap. The PSL_SERR_An register content is printed in the form of hex dump debug message. This patch decodes the PSL_ERR_An register contents, and prints a specific error message for each possible error bit. It also dumps the secondary registers AFU_ERR_An and PSL_DSISR_An, that may contain extra debug information. This patch also removes the large WARN message that used to report the cxl slice error interrupt, and replaces it by a short informative message, that draws attention to AFU implementation errors. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:22:03 +10:00
Ian Munsie	5e7823c9bc	cxl: Fix bug where AFU disable operation had no effect The AFU disable operation has a bug where it will not clear the enable bit and therefore will have no effect. To date this has likely been masked by fact that we perform an AFU reset before the disable, which also has the effect of clearing the enable bit, making the following disable operation effectively a noop on most hardware. This patch modifies the afu_control function to take a parameter to clear from the AFU control register so that the disable operation can clear the appropriate bit. This bug was uncovered on the Mellanox CX4, which uses an XSL rather than a PSL. On the XSL the reset operation will not complete while the AFU is enabled, meaning the enable bit was still set at the start of the disable and as a result this bug was hit and the disable also timed out. Because of this difference in behaviour between the PSL and XSL, this patch now makes the reset dependent on the card using a PSL to avoid waiting for a timeout on the XSL. It is entirely possible that we may be able to drop the reset altogether if it turns out we only ever needed it due to this bug - however I am not willing to drop it without further regression testing and have added comments to the code explaining the background. This also fixes a small issue where the AFU_Cntl register was read outside of the lock that protects it. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-07-08 22:11:18 +10:00
Michael Neuling	ad42de859f	cxl: Add set and get private data to context struct This provides AFU drivers a means to associate private data with a cxl context. This is particularly intended for make the new callbacks for driver specific events easier for AFU drivers to use, as they can easily get back to any private data structures they may use. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 18:35:08 +10:00
Philippe Bergheaud	b810253bd9	cxl: Add mechanism for delivering AFU driver specific events This adds an afu_driver_ops structure with fetch_event() and event_delivered() callbacks. An AFU driver such as cxlflash can fill this out and associate it with a context to enable passing custom AFU specific events to userspace. This also adds a new kernel API function cxl_context_pending_events(), that the AFU driver can use to notify the cxl driver that new specific events are ready to be delivered, and wake up anyone waiting on the context wait queue. The current count of AFU driver specific events is stored in the field afu_driver_events of the context structure. The cxl driver checks the afu_driver_events count during poll, select, read, etc. calls to check if an AFU driver specific event is pending, and calls fetch_event() to obtain and deliver that event. This way, the cxl driver takes care of all the usual locking semantics around these calls and handles all the generic cxl events, so that the AFU driver only needs to worry about it's own events. fetch_event() return a struct cxl_event_afu_driver_reserved, allocated by the AFU driver, and filled in with the specific event information and size. Total event size (header + data) should not be greater than CXL_READ_MIN_SIZE (4K). Th cxl driver prepends an appropriate cxl event header, copies the event to userspace, and finally calls event_delivered() to return the status of the operation to the AFU driver. The event is identified by the context and cxl_event_afu_driver_reserved pointers. Since AFU drivers provide their own means for userspace to obtain the AFU file descriptor (i.e. cxlflash uses an ioctl on their scsi file descriptor to obtain the AFU file descriptor) and the generic cxl driver will never use this event, the ABI of the event is up to each individual AFU driver. Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-28 18:34:56 +10:00
Ian Munsie	b385c9e971	cxl: Add support for CAPP DMA mode This adds support for using CAPP DMA mode, which is required for XSL based cards such as the Mellanox CX4 to function. This is currently an RFC as it depends on the corresponding support to be merged into skiboot first, which was submitted here: http://patchwork.ozlabs.org/patch/625582/ In the event that the skiboot on the system does not have the above support, it will indicate as such in the kernel log and abort the init process. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:10:26 +10:00
Frederic Barrat	6d382616ac	cxl: Abstract the differences between the PSL and XSL The XSL (Translation Service Layer) is a stripped down version of the PSL (Power Service Layer) used in some cards such as the Mellanox CX4. Like the PSL, it implements the CAIA architecture, but has a number of differences, mostly in it's implementation dependent registers. This adds an ops structure to abstract these differences to bring initial support for XSL CAPI devices. The XSL does not implement the optional architected SERR register, however while it treats it as a reserved register and should work with no special treatment, attempting to access it will cause the XSL_FEC (First Error Capture) register to be filled out, preventing it from capturing any subsequent errors. Therefore, this patch also prevents the kernel from trying to set up the SERR register so that the FEC register may still be useful, and to save one interrupt. The XSL also uses a special DMA cxl mode, which uses a slightly different init sequence for the CAPP and PHB. The kernel support for this will be in a future patch once the corresponding support has been merged into skiboot. Co-authored-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:08:54 +10:00
Ian Munsie	292841b096	cxl: Update process element after allocating interrupts In the kernel API, it is possible to attempt to allocate AFU interrupts after already starting a context. Since the process element structure used by the hardware is only filled out at the time the context is started, it will not be updated with the interrupt numbers that have just been allocated and therefore AFU interrupts will not work unless they were allocated prior to starting the context. This can present some difficulties as each CAPI enabled PCI device in the kernel API has a default context, which may need to be started very early to enable translations, potentially before interrupts can easily be set up. This patch makes the API more flexible to allow interrupts to be allocated after a context has already been started and takes care of updating the PE structure used by the hardware and notifying it to discard any cached copy it may have. The update is currently performed via a terminate/remove/add sequence. This is necessary on some hardware such as the XSL that does not properly support the update LLCMD. Note that this is only supported on powernv at present - attempting to perform this ordering on PowerVM will raise a warning. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-06-16 23:08:49 +10:00
Linus Torvalds	c04a588029	powerpc updates for 4.7 Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G. Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G. Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXPsGzAAoJEFHr6jzI4aWAVoAP/iKdrDe0eYHlVAE9SqnbsiZs lgDxdsC8P3fsmP1G9o/HkKhC82zHl/La8Ztz8dtqa+LkSzbfliWP1ztJsI7GsBFo tyCKzWnX9Rwvd3meHu/o/SQ29TNLm/PbPyyRqpj5QPbJ8XCXkAXR7ZZZqjvcMsJW /AgIr7Cgf53tl9oZzzl/c7CnNHhMq+NBdA71vhWtUx+T97wfJEGyKW6HhZyHDbEU iAki7fu77ZpEqC/Fh9swf0dCGBJ+a132NoMVo0AdV7EQLznUYlQpQEqa+1PyHZOP /ArOzf2mDg6m3PfCo1eiB07v8PnVZ3llEUbVAJNg3GUxbE4SHrqq/kwm0iElm3p/ DvFxerCwdX9vmskJX4wDs+pSZRabXYj9XVMptsgFzA4joWrqqb7mBHqaort88YcY YSljEt1bHyXmiJ+dBya40qARsWUkCVN7ZgEzdxckq0KI3w7g2tqpqIbO2lClWT6t B3GpqQ4jp34+d1M14FB91fIGK7tMvOhSInE0Mv9+tPvRsepXqiiU/SwdAtRlr3m2 zs/K+4FYcVjJ3Rmpgc+tI38PbZxHe212I35YN6L1LP+4ZfAtzz0NyKdooTIBtkbO 19pX4WbBjKq8zK+YutrySncBIrbnI6VjW51vtRhgVKZliPFO/6zKagyU6FbxM+E5 udQES+t3F/9gvtxgxtDe =YvyQ -----END PGP SIGNATURE----- Merge tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: "Highlights: - Support for Power ISA 3.0 (Power9) Radix Tree MMU from Aneesh Kumar K.V - Live patching support for ppc64le (also merged via livepatching.git) Various cleanups & minor fixes from: - Aaro Koskinen, Alexey Kardashevskiy, Andrew Donnellan, Aneesh Kumar K.V, Chris Smart, Daniel Axtens, Frederic Barrat, Gavin Shan, Ian Munsie, Lennart Sorensen, Madhavan Srinivasan, Mahesh Salgaonkar, Markus Elfring, Michael Ellerman, Oliver O'Halloran, Paul Gortmaker, Paul Mackerras, Rashmica Gupta, Russell Currey, Suraj Jitindar Singh, Thiago Jung Bauermann, Valentin Rothberg, Vipin K Parashar. General: - Update LMB associativity index during DLPAR add/remove from Nathan Fontenot - Fix branching to OOL handlers in relocatable kernel from Hari Bathini - Add support for userspace Power9 copy/paste from Chris Smart - Always use STRICT_MM_TYPECHECKS from Michael Ellerman - Add mask of possible MMU features from Michael Ellerman PCI: - Enable pass through of NVLink to guests from Alexey Kardashevskiy - Cleanups in preparation for powernv PCI hotplug from Gavin Shan - Don't report error in eeh_pe_reset_and_recover() from Gavin Shan - Restore initial state in eeh_pe_reset_and_recover() from Gavin Shan - Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" from Guilherme G Piccoli - Remove the dependency on EEH struct in DDW mechanism from Guilherme G Piccoli selftests: - Test cp_abort during context switch from Chris Smart - Add several tests for transactional memory support from Rashmica Gupta perf: - Add support for sampling interrupt register state from Anju T - Add support for unwinding perf-stackdump from Chandan Kumar cxl: - Configure the PSL for two CAPI ports on POWER8NVL from Philippe Bergheaud - Allow initialization on timebase sync failures from Frederic Barrat - Increase timeout for detection of AFU mmio hang from Frederic Barrat - Handle num_of_processes larger than can fit in the SPA from Ian Munsie - Ensure PSL interrupt is configured for contexts with no AFU IRQs from Ian Munsie - Add kernel API to allow a context to operate with relocate disabled from Ian Munsie - Check periodically the coherent platform function's state from Christophe Lombard Freescale: - Updates from Scott: "Contains 86xx fixes, minor device tree fixes, an erratum workaround, and a kconfig dependency fix." * tag 'powerpc-4.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (192 commits) powerpc/86xx: Fix PCI interrupt map definition powerpc/86xx: Move pci1 definition to the include file powerpc/fsl: Fix build of the dtb embedded kernel images powerpc/fsl: Fix rcpm compatible string powerpc/fsl: Remove FSL_SOC dependency from FSL_LBC powerpc/fsl-pci: Add a workaround for PCI 5 errata powerpc/fsl: Fix SPI compatible on t208xrdb and t1040rdb powerpc/powernv/npu: Add PE to PHB's list powerpc/powernv: Fix insufficient memory allocation powerpc/iommu: Remove the dependency on EEH struct in DDW mechanism Revert "powerpc/eeh: Fix crash in eeh_add_device_early() on Cell" powerpc/eeh: Drop unnecessary label in eeh_pe_change_owner() powerpc/eeh: Ignore handlers in eeh_pe_reset_and_recover() powerpc/eeh: Restore initial state in eeh_pe_reset_and_recover() powerpc/eeh: Don't report error in eeh_pe_reset_and_recover() Revert "powerpc/powernv: Exclude root bus in pnv_pci_reset_secondary_bus()" powerpc/powernv/npu: Enable NVLink pass through powerpc/powernv/npu: Rework TCE Kill handling powerpc/powernv/npu: Add set/unset window helpers powerpc/powernv/ioda2: Export debug helper pe_level_printk() ...	2016-05-20 10:12:41 -07:00
Christophe Lombard	266eab8f32	cxl: Check periodically the coherent platform function's state In the PowerVM environment, the PHYP CoherentAccel component manages the state of the Coherent Accelerator Processor Interface adapter and virtualizes CAPI resources, handles CAPP, PSL, PSL Slice errors - and interrupts - and provides a new set of hcalls for the OS APIs to utilize Accelerator Function Unit (AFU). During the course of operation, a coherent platform function can encounter errors. Some possible reason for errors are: • Hardware recoverable and unrecoverable errors • Transient and over-threshold correctable errors PHYP implements its own state model for the coherent platform function. The state of the AFU is available through a hcall. The current implementation of the cxl driver, for the PowerVM environment, checks this state of the AFU only when an action is requested - open a device, ioctl command, memory map, attach/detach a process - from an external driver - cxlflash, libcxl. If an error is detected the cxl driver handles the error according the content of the Power Architecture Platform Requirements document. But in case of low-level troubles (or error injection), the PHYP component may reset the card and change the AFU state. The PHYP interface doesn't provide any way to be notified when that happens thus implies that the cxl driver: • cannot handle immediatly the state change of the AFU. • cannot notify other drivers (cxlflash, ...) The purpose of this patch is to wake up the cpu periodically to check the current state of each AFU and to see if we need to enter an error recovery path. Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:11 +10:00
Ian Munsie	7a0d85d313	cxl: Add kernel API to allow a context to operate with relocate disabled cxl devices typically access memory using an MMU in much the same way as the CPU, and each context includes a state register much like the MSR in the CPU. Like the CPU, the state register includes a bit to enable relocation, which we currently always enable. In some cases, it may be desirable to allow a device to access memory using real addresses instead of effective addresses, so this adds a new API, cxl_set_translation_mode, that can be used to disable relocation on a given kernel context. This can allow for the creation of a special privileged context that the device can use if it needs relocation disabled, and can use regular contexts at times when it needs relocation enabled. This interface is only available to users of the kernel API for obvious reasons, and will never be supported in a virtualised environment. This will be used by the upcoming cxl support in the mlx5 driver. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:10 +10:00
Ian Munsie	0e5b5ba17a	cxl: Remove duplicate #defines These defines are not used, but other equivalent definitions (CXL_SPA_SW_CMD_*) are used. Remove the unused defines. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-05-11 21:54:09 +10:00
Michael Neuling	2bc79ffcbb	cxl: Poll for outstanding IRQs when detaching a context When detaching contexts, we may still have interrupts in the system which are yet to be delivered to any CPU and be acked in the PSL. This can result in a subsequent unrelated process getting an spurious IRQ or an interrupt for a non-existent context. This polls the PSL to ensure that the PSL is clear of IRQs for the detached context, before removing the context from the idr. Signed-off-by: Michael Neuling <mikey@neuling.org> Tested-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Tested-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-27 12:04:48 +10:00
Frederic Barrat	e009a7e858	cxl: Allow initialization on timebase sync failures Failure to synchronize the PSL timebase currently prevents the initialization of the cxl card, thus rendering the card useless. This is too extreme for a feature which is rarely used, if at all. No hardware AFUs or software is currently using PSL timebase. This patch still tries to synchronize the PSL timebase when the card is initialized, but ignores the error if it can't. Instead, it reports a status via /sys. Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-04-22 21:45:44 +10:00
Vaibhav Jain	17eb3eef19	cxl: Ignore probes for virtual afu pci devices Add a check at the beginning of cxl_probe function to ignore virtual pci devices created for each afu registered. This fixes the the errors messages logged about missing CXL vsec, when cxl probe is unable to find necessary vsec entries in device pci config space. The error message logged are of the form : cxl-pci 0004:00:00.0: ABORTING: CXL VSEC not found! cxl-pci 0004:00:00.0: cxl_init_adapter failed: -19 Cc: Ian Munsie <imunsie@au1.ibm.com> Cc: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Reviewed-by: fbarrat@linux.vnet.ibm.com Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:40:03 +11:00
Christophe Lombard	0d400f77c1	cxl: Adapter failure handling Check the AFU state whenever an API is called. The hypervisor may issue a reset of the adapter when it detects a fault. When it happens, it launches an error recovery which will either move the AFU to a permanent failure state, or in the disabled state. If the AFU is found to be disabled, detach all existing contexts from it before issuing a AFU reset to re-enable it. Before detaching contexts, notify any kernel driver through the EEH callbacks of the AFU pci device. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:40:00 +11:00
Frederic Barrat	d601ea918b	cxl: Support the cxl kernel API from a guest Like on bare-metal, the cxl driver creates a virtual PHB and a pci device for the AFU. The configuration space of the device is mapped to the configuration record of the AFU. Reuse the code defined in afu_cr_read8\|16\|32() when reading the configuration space of the AFU device. Even though the (virtual) AFU device is a pci device, the adapter is not. So a driver using the cxl kernel API cannot read the VPD of the adapter through the usual PCI interface. Therefore, we add a call to the cxl kernel API: ssize_t cxl_read_adapter_vpd(struct pci_dev dev, void buf, size_t count); Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:40:00 +11:00
Christophe Lombard	594ff7d067	cxl: Support to flash a new image on the adapter from a guest The new flash.c file contains the logic to flash a new image on the adapter, through a hcall. It is an iterative process, with chunks of data of 1M at a time. There are also 2 phases: write and verify. The flash operation itself is driven from a user-land tool. Once flashing is successful, an rtas call is made to update the device tree with the new properties values for the adapter and the AFU(s) Add a new char device for the adapter, so that the flash tool can access the card, even if there is no valid AFU on it. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:39:56 +11:00
Christophe Lombard	4752876c71	cxl: sysfs support for guests Filter out a few adapter parameters which don't make sense in a guest. Document the changes. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:39:40 +11:00
Christophe Lombard	14baf4d9c7	cxl: Add guest-specific code The new of.c file contains code to parse the device tree to find out about cxl adapters and AFUs. guest.c implements the guest-specific callbacks for the backend API. The process element ID is not known until the context is attached, so we have to separate the context ID assigned by the cxl driver from the process element ID visible to the user applications. In bare-metal, the 2 IDs match. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> [mpe: Fix SMP=n build, fix PSERIES=n build, minor whitespace fixes] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 23:36:52 +11:00
Christophe Lombard	cbffa3a514	cxl: Separate bare-metal fields in adapter and AFU data structures Introduce sub-structures containing the bare-metal specific fields in the structures describing the adapter (struct cxl) and AFU (struct cxl_afu). Update all their references. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:54 +11:00
Christophe Lombard	444c4ba461	cxl: New hcalls to support cxl adapters The hypervisor calls provide an interface with a coherent platform facility and function. It matches version 0.16 of the 'PAPR changes' document. The following hcalls are supported: H_ATTACH_CA_PROCESS Attach a process element to a coherent platform function. H_DETACH_CA_PROCESS Detach a process element from a coherent platform function. H_CONTROL_CA_FUNCTION Allow the partition to manipulate or query certain coherent platform function behaviors. H_COLLECT_CA_INT_INFO Collect interrupt info about a coherent. platform function after an interrupt occurred H_CONTROL_CA_FAULTS Control the operation of a coherent platform function after a fault occurs. H_DOWNLOAD_CA_FACILITY Support for downloading a base adapter image to the coherent platform facility, and for validating the entire image after the download. H_CONTROL_CA_FACILITY Allow the partition to manipulate or query certain coherent platform facility behaviors. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:52 +11:00
Frederic Barrat	6d625ed9a7	cxl: Update cxl_irq() prototype The context parameter when calling cxl_irq() should be strongly typed. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:49 +11:00
Frederic Barrat	ea2d1f95ef	cxl: Isolate a few bare-metal-specific calls A few functions are mostly common between bare-metal and guest and just need minor tuning. To avoid crowding the backend API, introduce a few 'if' based on the CPU being in HV mode. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:47 +11:00
Frederic Barrat	2b04cf310b	cxl: Rename some bare-metal specific functions Rename a few functions, changing the 'cxl_' prefix to either 'cxl_pci_' or 'cxl_native_', to make clear that the implementation is bare-metal specific. Those functions will have an equivalent implementation for a guest in a later patch. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:46 +11:00
Frederic Barrat	5be587b111	cxl: Introduce implementation-specific API The backend API (in cxl.h) lists some low-level functions whose implementation is different on bare-metal and in a guest. Each environment implements its own functions, and the common code uses them through function pointers, defined in cxl_backend_ops Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:43 +11:00
Frederic Barrat	d56d301b51	cxl: Move bare-metal specific code to specialized files Move a few functions around to better separate code specific to bare-metal environment from code which will be commonly used between guest and bare-metal. Code specific to bare-metal is meant to be in native.c or pci.c only. It's basically anything which touches the card p1 registers, some p2 registers not needed from a guest and the PCI interface. Co-authored-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:39 +11:00
Christophe Lombard	8633186209	cxl: Move common code away from bare-metal-specific files Move around some functions which will be accessed from the bare-metal and guest environments. Code in native.c and pci.c is meant to be bare-metal specific. Other files contain code which may be shared with guests. Co-authored-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Signed-off-by: Christophe Lombard <clombard@linux.vnet.ibm.com> Reviewed-by: Manoj Kumar <manoj@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-03-09 13:05:37 +11:00
Vaibhav Jain	7b8ad495d5	cxl: Fix DSI misses when the context owning task exits Presently when a user-space process issues CXL_IOCTL_START_WORK ioctl we store the pid of the current task_struct and use it to get pointer to the mm_struct of the process, while processing page or segment faults from the capi card. However this causes issues when the thread that had originally issued the start-work ioctl exits in which case the stored pid is no more valid and the cxl driver is unable to handle faults as the mm_struct corresponding to process is no more accessible. This patch fixes this issue by using the mm_struct of the next alive task in the thread group. This is done by iterating over all the tasks in the thread group starting from thread group leader and calling get_task_mm on each one of them. When a valid mm_struct is obtained the pid of the associated task is stored in the context replacing the exiting one for handling future faults. The patch introduces a new function named get_mem_context that checks if the current task pointed to by ctx->pid is dead? If yes it performs the steps described above. Also a new variable cxl_context.glpid is introduced which stores the pid of the thread group leader associated with the context owning task. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Reported-by: Frank Haverkamp <HAVERKAM@de.ibm.com> Suggested-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Frederic Barrat <fbarrat@linux.vnet.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2016-01-05 16:28:25 +11:00
Vaibhav Jain	1b5df59e50	cxl: Fix possible idr warning when contexts are released An idr warning is reported when a context is release after the capi card is unbound from the cxl driver via sysfs. Below are the steps to reproduce: 1. Create multiple afu contexts in an user-space application using libcxl. 2. Unbind capi card from cxl using command of form echo <capi-card-pci-addr> > /sys/bus/pci/drivers/cxl-pci/unbind 3. Exit/kill the application owning afu contexts. After above steps a warning message is usually seen in the kernel logs of the form "idr_remove called for id=<context-id> which is not allocated." This is caused by the function cxl_release_afu which destroys the contexts_idr table. So when a context is release no entry for context pe is found in the contexts_idr table and idr code prints this warning. This patch fixes this issue by increasing & decreasing the ref-count on the afu device when a context is initialized or when its freed respectively. This prevents the afu from being released until all the afu contexts have been released. The patch introduces two new functions namely cxl_afu_get/put that manage the ref-count on the afu device. Also the patch removes code inside cxl_dev_context_init that increases ref on the afu device as its guaranteed to be alive during this function. Reported-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-11-24 14:21:27 +11:00
Andrew Donnellan	8dde152ea3	cxl: fix leak of IRQ names in cxl_free_afu_irqs() cxl_free_afu_irqs() doesn't free IRQ names when it releases an AFU's IRQ ranges. The userspace API equivalent in afu_release_irqs() calls afu_irq_name_free() to release the IRQ names. Call afu_irq_name_free() in cxl_free_afu_irqs() to release the IRQ names. Make afu_irq_name_free() non-static to allow this. Reported-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Fixes: `6f7f0b3df6` ("cxl: Add AFU virtual PHB and kernel API") Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com> Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Reviewed-by: Matthew R. Ochs <mrochs@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-10-01 11:49:32 +10:00
Philippe Bergheaud	390fd5929f	cxl: Set up and enable PSL Timebase This patch configures the PSL Timebase function and enables it, after the CAPP has been initialized by OPAL. Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Philippe Bergheaud <felix@linux.vnet.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-30 18:56:34 +10:00
Ian Munsie	55e07668fb	cxl: Fix force unmapping mmaps of contexts allocated through the kernel api The cxl user api uses the address_space associated with the file when we need to force unmap all cxl mmap regions (e.g. on eeh, driver detach, etc). Currently, contexts allocated through the kernel api do not do this and instead skip the mmap invalidation, potentially allowing them to poke at the hardware after such an event, which may cause all sorts of trouble. This patch allocates an address_space for cxl contexts allocated through the kernel api so that the same invalidate path will for these contexts as well. We don't use the anonymous inode's address_space, as doing so could invalidate any mmaps of completely unrelated drivers using anonymous file descriptors. This patch also introduces a kernelapi flag, so we know when freeing the context if the address_space was allocated by us and needs to be freed. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-30 18:47:26 +10:00
Ian Munsie	d9232a3da8	cxl: Add alternate MMIO error handling userspace programs using cxl currently have to use two strategies for dealing with MMIO errors simultaneously. They have to check every read for a return of all Fs in case the adapter has gone away and the kernel has not yet noticed, and they have to deal with SIGBUS in case the kernel has already noticed, invalidated the mapping and marked the context as failed. In order to simplify things, this patch adds an alternative approach where the kernel will return a page filled with Fs instead of delivering a SIGBUS. This allows userspace to only need to deal with one of these two error paths, and is intended for use in libraries that use cxl transparently and may not be able to safely install a signal handler. This approach will only work if certain constraints are met. Namely, if the application is both reading and writing to an address in the problem state area it cannot assume that a non-FF read is OK, as it may just be reading out a value it has previously written. Further - since only one page is used per context a write to a given offset would be visible when reading the same offset from a different page in the mapping (this only applies within a single context, not between contexts). An application could deal with this by e.g. making sure it also reads from a read-only offset after any reads to a read/write offset. Due to these constraints, this functionality must be explicitly requested by userspace when starting the context by passing in the CXL_START_WORK_ERR_FF flag. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-18 19:34:43 +10:00
Daniel Axtens	9e8df8a219	cxl: EEH support EEH (Enhanced Error Handling) allows a driver to recover from the temporary failure of an attached PCI card. Enable basic CXL support for EEH. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:32:08 +10:00
Daniel Axtens	13e68d8bd0	cxl: Allow the kernel to trust that an image won't change on PERST. Provide a kernel API and a sysfs entry which allow a user to specify that when a card is PERSTed, it's image will stay the same, allowing it to participate in EEH. cxl_reset is used to reflash the card. In that case, we cannot safely assert that the image will not change. Therefore, disallow cxl_reset if the flag is set. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:32:07 +10:00
Daniel Axtens	05155772f6	cxl: Allocate and release the SPA with the AFU Previously the SPA was allocated and freed upon entering and leaving AFU-directed mode. This causes some issues for error recovery - contexts hold a pointer inside the SPA, and they may persist after the AFU has been detached. We would ideally like to allocate the SPA when the AFU is allocated, and release it until the AFU is released. However, we don't know how big the SPA needs to be until we read the AFU descriptor. Therefore, restructure the code: - Allocate the SPA only once, on the first attach. - Release the SPA only when the entire AFU is being released (not detached). Guard the release with a NULL check, so we don't free if it was never allocated (e.g. dedicated mode) Acked-by: Cyril Bur <cyrilbur@gmail.com> Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:32:04 +10:00
Daniel Axtens	0b3f9c757c	cxl: Drop commands if the PCI channel is not in normal state If the PCI channel has gone down, don't attempt to poke the hardware. We need to guard every time cxl_whatever_(read\|write) is called. This is because a call to those functions will dereference an offset into an mmio register, and the mmio mappings get invalidated in the EEH teardown. Check in the read/write functions in the header. We give them the same semantics as usual PCI operations: - a write to a channel that is down is ignored. - a read from a channel that is down returns all fs. Also, we try to access the MMIO space of a vPHB device as part of the PCI disable path. Because that's a read that bypasses most of our usual checks, we handle it explicitly. As far as user visible warnings go: - Check link state in file ops, return -EIO if down. - Be reasonably quiet if there's an error in a teardown path, or when we already know the hardware is going down. - Throw a big WARN if someone tries to start a CXL operation while the card is down. This gives a useful stacktrace for debugging whatever is doing that. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:32:03 +10:00
Daniel Axtens	588b34be20	cxl: Convert MMIO read/write macros to inline functions We're about to make these more complex, so make them functions first. Signed-off-by: Daniel Axtens <dja@axtens.net> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-08-14 21:32:02 +10:00
Michael Neuling	6f7f0b3df6	cxl: Add AFU virtual PHB and kernel API This patch does two things. Firstly it presents the Accelerator Function Unit (AFUs) behind the POWER Service Layer (PSL) as PCI devices on a virtual PCI Host Bridge (vPHB). This in in addition to the PSL being a PCI device itself. As part of the Coherent Accelerator Interface Architecture (CAIA) AFUs can provide an AFU configuration. This AFU configuration recored is architected to be the same as a PCI config space. This patch sets discovers the AFU configuration records, provides AFU config space read/write functions to these configuration records. It then enumerates the PCI bus. It also hooks in PCI ops where appropriate. It also destroys the vPHB when the physical card is removed. Secondly, it add an in kernel API for AFU to use CXL. AFUs must present a driver that firstly binds as a PCI device. This PCI device can then be using to do CXL specific operations (that can't sit in the PCI ops) using this API. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-03 13:27:20 +10:00
Michael Neuling	0520336afe	cxl: Export file ops for use by API The cxl kernel API will allow drivers other than cxl to export a file descriptor which has the same userspace API. These file descriptors will be able to be used against libcxl. This exports those file ops for use by other drivers. Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-03 13:27:20 +10:00
Michael Neuling	ec249dd860	cxl: Move include file cxl.h -> cxl-base.h This moves the current include file from cxl.h -> cxl-base.h. This current include file is used only to pass information between the base driver that needs to be built into the kernel and the cxl module. This is to make way for a new include/misc/cxl.h which will contain just the kernel API for other driver to use Signed-off-by: Michael Neuling <mikey@neuling.org> Acked-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>	2015-06-03 13:27:19 +10:00

1 2

67 Commits