363 Commits

Author SHA1 Message Date
Dan Williams
f3e6b3ae9c acpi/ghes: Remove CXL CPER notifications
Initial tests with the CXL CPER implementation identified that error
reports were being duplicated in the log and the trace event [1].  Then
it was discovered that the notification handler took sleeping locks
while the GHES event handling runs in spin_lock_irqsave() context [2]

While the duplicate reporting was fixed in v6.8-rc4, the fix for the
sleeping-lock-vs-atomic collision would enjoy more time to settle and
gain some test cycles.  Given how late it is in the development cycle,
remove the CXL hookup for now and try again during the next merge
window.

Note that end result is that v6.8 does not emit CXL CPER payloads to the
kernel log, but this is in line with the CXL trend to move error
reporting to trace events instead of the kernel log.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: http://lore.kernel.org/r/20240108165855.00002f5a@Huawei.com [1]
Closes: http://lore.kernel.org/r/b963c490-2c13-4b79-bbe7-34c6568423c7@moroto.mountain [2]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-02-20 22:50:52 -08:00
Ira Weiny
54ce1927eb cxl/cper: Fix errant CPER prints for CXL events
Jonathan reports that CXL CPER events dump an extra generic error
message.

	{1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
	{1}[Hardware Error]: event severity: recoverable
	{1}[Hardware Error]:  Error 0, type: recoverable
	{1}[Hardware Error]:   section type: unknown, fbcd0a77-c260-417f-85a9-088b1621eba6
	{1}[Hardware Error]:   section length: 0x90
	{1}[Hardware Error]:   00000000: 00000090 00000007 00000000 0d938086 ................
	{1}[Hardware Error]:   00000010: 00100000 00000000 00040000 00000000 ................
	...

CXL events were rerouted though the CXL subsystem for additional
processing.  However, when that work was done it was missed that
cper_estatus_print_section() continued with a generic error message
which is confusing.

Teach CPER print code to ignore printing details of some section types.
Assign the CXL event GUIDs to this set to prevent confusing unknown
prints.

Reported-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2024-02-03 18:31:17 +01:00
Linus Torvalds
db5ccb9eb2 cxl for v6.8
- Add support for parsing the Coherent Device Attribute Table (CDAT)
 
 - Add support for calculating a platform CXL QoS class from CDAT data
 
 - Unify the tracing of EFI CXL Events with native CXL Events.
 
 - Add Get Timestamp support
 
 - Miscellaneous cleanups and fixups
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQSbo+XnGs+rwLz9XGXfioYZHlFsZwUCZaHVvAAKCRDfioYZHlFs
 Z3sCAQDPHSsHmj845k4lvKbWjys3eh78MKKEFyTXLQgYhOlsGAEAigQY2ZiSum52
 nwdIgpOOADNt0Iq6yXuLsmn9xvY9bAU=
 =HjCl
 -----END PGP SIGNATURE-----

Merge tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "The bulk of this update is support for enumerating the performance
  capabilities of CXL memory targets and connecting that to a platform
  CXL memory QoS class. Some follow-on work remains to hook up this data
  into core-mm policy, but that is saved for v6.9.

  The next significant update is unifying how CXL event records (things
  like background scrub errors) are processed between so called
  "firmware first" and native error record retrieval. The CXL driver
  handler that processes the record retrieved from the device mailbox is
  now the handler for that same record format coming from an EFI/ACPI
  notification source.

  This also contains miscellaneous feature updates, like Get Timestamp,
  and other fixups.

  Summary:

   - Add support for parsing the Coherent Device Attribute Table (CDAT)

   - Add support for calculating a platform CXL QoS class from CDAT data

   - Unify the tracing of EFI CXL Events with native CXL Events.

   - Add Get Timestamp support

   - Miscellaneous cleanups and fixups"

* tag 'cxl-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (41 commits)
  cxl/core: use sysfs_emit() for attr's _show()
  cxl/pci: Register for and process CPER events
  PCI: Introduce cleanup helpers for device reference counts and locks
  acpi/ghes: Process CXL Component Events
  cxl/events: Create a CXL event union
  cxl/events: Separate UUID from event structures
  cxl/events: Remove passing a UUID to known event traces
  cxl/events: Create common event UUID defines
  cxl/events: Promote CXL event structures to a core header
  cxl: Refactor to use __free() for cxl_root allocation in cxl_endpoint_port_probe()
  cxl: Refactor to use __free() for cxl_root allocation in cxl_find_nvdimm_bridge()
  cxl: Fix device reference leak in cxl_port_perf_data_calculate()
  cxl: Convert find_cxl_root() to return a 'struct cxl_root *'
  cxl: Introduce put_cxl_root() helper
  cxl/port: Fix missing target list lock
  cxl/port: Fix decoder initialization when nr_targets > interleave_ways
  cxl/region: fix x9 interleave typo
  cxl/trace: Pass UUID explicitly to event traces
  cxl/region: use %pap format to print resource_size_t
  cxl/region: Add dev_dbg() detail on failure to allocate HPA space
  ...
2024-01-18 16:22:43 -08:00
Ira Weiny
671a794c33 acpi/ghes: Process CXL Component Events
BIOS can configure memory devices as firmware first.  This will send CXL
events to the firmware instead of the OS.  The firmware can then send
these events to the OS via UEFI.

UEFI v2.10 section N.2.14 defines a Common Platform Error Record (CPER)
format for CXL Component Events.  The format is mostly the same as the
CXL Common Event Record Format.  The difference is the use of a GUID in
the Section Type rather than a UUID as part of the event itself.

Add GHES support to detect CXL CPER records and call a registered
callback with the event.

A notifier chain was considered for the callback but the complexity did
not justify the use case as only the CXL subsystem requires this event.
Enforce that only one callback can be registered at any time.

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20231220-cxl-cper-v5-7-1bb8a4ca2c7a@intel.com
[djbw: fixup checkpatch errors]
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2024-01-09 15:41:23 -08:00
Shuai Xue
a70297d221 ACPI: APEI: set memory failure flags as MF_ACTION_REQUIRED on synchronous events
There are two major types of uncorrected recoverable (UCR) errors :

 - Synchronous error: The error is detected and raised at the point of
   the consumption in the execution flow, e.g. when a CPU tries to
   access a poisoned cache line. The CPU will take a synchronous error
   exception such as Synchronous External Abort (SEA) on Arm64 and
   Machine Check Exception (MCE) on X86. OS requires to take action (for
   example, offline failure page/kill failure thread) to recover this
   uncorrectable error.

 - Asynchronous error: The error is detected out of processor execution
   context, e.g. when an error is detected by a background scrubber.
   Some data in the memory are corrupted. But the data have not been
   consumed. OS is optional to take action to recover this uncorrectable
   error.

When APEI firmware first is enabled, a platform may describe one error
source for the handling of synchronous errors (e.g. MCE or SEA notification
), or for handling asynchronous errors (e.g. SCI or External Interrupt
notification). In other words, we can distinguish synchronous errors by
APEI notification. For synchronous errors, kernel will kill the current
process which accessing the poisoned page by sending SIGBUS with
BUS_MCEERR_AR. In addition, for asynchronous errors, kernel will notify the
process who owns the poisoned page by sending SIGBUS with BUS_MCEERR_AO in
early kill mode. However, the GHES driver always sets mf_flags to 0 so that
all synchronous errors are handled as asynchronous errors in memory failure.

To this end, set memory failure flags as MF_ACTION_REQUIRED on synchronous
events.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Tested-by: Ma Wupeng <mawupeng1@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-12-21 14:50:25 +01:00
Avadhut Naik
22fca621bd ACPI: APEI: EINJ: Add support for vendor defined error types
Vendor-Defined Error types are supported by the platform apart from
standard error types if bit 31 is set in the output of GET_ERROR_TYPE
Error Injection Action.[1] While the errors themselves and the length
of their associated "OEM Defined data structure" might vary between
vendors, the physical address of this structure can be computed through
vendor_extension and length fields of "SET_ERROR_TYPE_WITH_ADDRESS" and
"Vendor Error Type Extension" Structures respectively.[2][3]

Currently, however, the einj module only computes the physical address of
Vendor Error Type Extension Structure. Neither does it compute the physical
address of OEM Defined structure nor does it establish the memory mapping
required for injecting Vendor-defined errors. Consequently, userspace
tools have to establish the very mapping through /dev/mem, nopat kernel
parameter and system calls like mmap/munmap initially before injecting
Vendor-defined errors.

Circumvent the issue by computing the physical address of OEM Defined data
structure and establishing the required mapping with the structure. Create
a new file "oem_error", if the system supports Vendor-defined errors, to
export this mapping, through debugfs_create_blob(). Userspace tools can
then populate their respective OEM Defined structure instances and just
write to the file as part of injecting Vendor-defined Errors. Similarly,
the tools can also read from the file if the system firmware provides some
information through the OEM defined structure after error injection.

[1] ACPI specification 6.5, section 18.6.4
[2] ACPI specification 6.5, Table 18.31
[3] ACPI specification 6.5, Table 18.32

Suggested-by: Yazen Ghannam <yazen.ghannam@amd.com>
Signed-off-by: Avadhut Naik <Avadhut.Naik@amd.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-21 21:10:44 +01:00
Avadhut Naik
709f3cbd65 ACPI: APEI: EINJ: Refactor available_error_type_show()
OSPM can discover the error injection capabilities of the platform by
executing GET_ERROR_TYPE error injection action.[1] The action returns
a DWORD representing a bitmap of platform supported error injections.[2]

The available_error_type_show() function determines the bits set within
this DWORD and provides a verbose output, from einj_error_type_string
array, through /sys/kernel/debug/apei/einj/available_error_type file.

The function however, assumes one to one correspondence between an error's
position in the bitmap and its array entry offset. Consequently, some
errors like Vendor Defined Error Type fail this assumption and will
incorrectly be shown as not supported, even if their corresponding bit is
set in the bitmap and they have an entry in the array.

Navigate around the issue by converting einj_error_type_string into an
array of structures with a predetermined mask for all error types
corresponding to their bit position in the DWORD returned by GET_ERROR_TYPE
action. The same breaks the aforementioned assumption resulting in all
supported error types by a platform being outputted through the above
available_error_type file.

[1] ACPI specification 6.5, Table 18.25
[2] ACPI specification 6.5, Table 18.30

Suggested-by: Alexey Kardashevskiy <alexey.kardashevskiy@amd.com>
Signed-off-by: Avadhut Naik <Avadhut.Naik@amd.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-11-21 21:10:44 +01:00
Jeshua Smith
fac475aab7 ACPI: APEI: Use ERST timeout for slow devices
Slow devices such as flash may not meet the default 1ms timeout value,
so use the ERST max execution time value that they provide as the
timeout if it is larger.

Signed-off-by: Jeshua Smith <jeshuas@nvidia.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-10-24 20:50:17 +02:00
Shiju Jose
e2abc47a5a ACPI: APEI: Fix AER info corruption when error status data has multiple sections
ghes_handle_aer() passes AER data to the PCI core for logging and
recovery by calling aer_recover_queue() with a pointer to struct
aer_capability_regs.

The problem was that aer_recover_queue() queues the pointer directly
without copying the aer_capability_regs data.  The pointer was to
the ghes->estatus buffer, which could be reused before
aer_recover_work_func() reads the data.

To avoid this problem, allocate a new aer_capability_regs structure
from the ghes_estatus_pool, copy the AER data from the ghes->estatus
buffer into it, pass a pointer to the new struct to
aer_recover_queue(), and free it after aer_recover_work_func() has
processed it.

Reported-by: Bjorn Helgaas <helgaas@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com>
Signed-off-by: Shiju Jose <shiju.jose@huawei.com>
[ rjw: Subject edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-09-21 20:44:23 +02:00
Li Yang
9368aa1882 APEI: GHES: correctly return NULL for ghes_get_devices()
Since 315bada690e0 ("EDAC: Check for GHES preference in the
chipset-specific EDAC drivers"), vendor specific EDAC driver will not
probe correctly when CONFIG_ACPI_APEI_GHES is enabled but no GHES device
is present.  Make ghes_get_devices() return NULL when the GHES device
list is empty to fix the problem.

Fixes: 9057a3f7ac36 ("EDAC/ghes: Prepare to make ghes_edac a proper module")
Signed-off-by: Li Yang <leoyang.li@nxp.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-12 19:31:48 +02:00
Miaohe Lin
d38f6bcea8 ACPI: APEI: mark bert_disable as __initdata
It's only used inside the __init section. Mark it __initdata.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-12 19:23:25 +02:00
Miaohe Lin
4348270137 ACPI: APEI: GHES: Remove unused ghes_estatus_pool_size_request()
ghes_estatus_pool_size_request() is unused now, so remove it.

Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-06-05 19:13:14 +02:00
Arnd Bergmann
fd936fd8ac efi: fix missing prototype warnings
The cper.c file needs to include an extra header, and efi_zboot_entry
needs an extern declaration to avoid these 'make W=1' warnings:

drivers/firmware/efi/libstub/zboot.c:65:1: error: no previous prototype for 'efi_zboot_entry' [-Werror=missing-prototypes]
drivers/firmware/efi/efi.c:176:16: error: no previous prototype for 'efi_attr_is_visible' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:626:6: error: no previous prototype for 'cper_estatus_print' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:649:5: error: no previous prototype for 'cper_estatus_check_header' [-Werror=missing-prototypes]
drivers/firmware/efi/cper.c:662:5: error: no previous prototype for 'cper_estatus_check' [-Werror=missing-prototypes]

To make this easier, move the cper specific declarations to
include/linux/cper.h.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2023-05-25 09:26:19 +02:00
Shuai Xue
f1e65718ec ACPI: APEI: EINJ: warn on invalid argument when explicitly indicated by platform
OSPM executes an EXECUTE_OPERATION action to instruct the platform to begin
the injection operation, then executes a GET_COMMAND_STATUS action to
determine the status of the completed operation. The ACPI Specification
documented error codes[1] are:

	0 = Success (Linux #define EINJ_STATUS_SUCCESS)
	1 = Unknown failure (Linux #define EINJ_STATUS_FAIL)
	2 = Invalid Access (Linux #define EINJ_STATUS_INVAL)

The original code report -EBUSY for both "Unknown Failure" and "Invalid
Access" cases. Actually, firmware could do some platform dependent sanity
checks and returns different error codes, e.g. "Invalid Access" to indicate
to the user that the parameters they supplied cannot be used for injection.

To this end, fix to return -EINVAL in the __einj_error_inject() error
handling case instead of always -EBUSY, when explicitly indicated by the
platform in the status of the completed operation.

[1] ACPI Specification 6.5 18.6.1. Error Injection Table

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-27 20:46:08 +02:00
Tony Luck
fe6603cafa ACPI: APEI: EINJ: Add CXL error types
ACPI 6.5 added six new error types for CXL. See chapter 18
table 18.30.

Add strings for the new types so that Linux will list them in the
/sys/kernel/debug/apei/einj/available_error_types file.

It seems no other changes are needed. Linux already accepts
the CXL codes (on a BIOS that advertises them).

Signed-off-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-03-20 18:33:00 +01:00
Shuai Xue
53fc7e80f3 ACPI: APEI: EINJ: Limit error type to 32-bit width
The bit map of error types to inject is 32-bit width [1].

Add parameter check to reflect the fact.

[1] ACPI Specification 6.4, Section 18.6.4. Error Types

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2023-01-30 16:40:05 +01:00
Steven Rostedt (Google)
292a089d78 treewide: Convert del_timer*() to timer_shutdown*()
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown".  After a timer is set to this state, then it can no
longer be re-armed.

The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed.  It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.

This was created by using a coccinelle script and the following
commands:

    $ cat timer.cocci
    @@
    expression ptr, slab;
    identifier timer, rfield;
    @@
    (
    -       del_timer(&ptr->timer);
    +       timer_shutdown(&ptr->timer);
    |
    -       del_timer_sync(&ptr->timer);
    +       timer_shutdown_sync(&ptr->timer);
    )
      ... when strict
          when != ptr->timer
    (
            kfree_rcu(ptr, rfield);
    |
            kmem_cache_free(slab, ptr);
    |
            kfree(ptr);
    )

    $ spatch timer.cocci . > /tmp/t.patch
    $ patch -p1 < /tmp/t.patch

Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-12-25 13:38:09 -08:00
Linus Torvalds
7adcadb984 - Make ghes_edac a simple module like the rest of the EDAC drivers and
drop this forced built-in only configuration by disentangling it from
 GHES. Work by Jia He.
 
 - The usual small cleanups and improvements all over EDAC land
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOXPvoACgkQEsHwGGHe
 VUq98xAAmhz4u9e9pXG0Ixkx25ZtnZ+YxANeQ53Hsa2gWicbcoFgL2E30gi97c1y
 X9W361B2Q5dYq+J/YRUnEOXlI/KMWLxzNykSvVipUFNfxXZH+PijEAArz2V35/uE
 6ZISRLUYVYEtHEoUXbTogeyBmBUnIaJfYheZCluDQlWPggsDESP1qmE+FTg25OBs
 rDl5y+zUZYPxrWustNodVThPyhdMwGyYAUS6qYKCoNs9SNkAjGnrXoPc9j/U+cV+
 qMY2dNS3uKnCujKEssQhcHucyWgCEDvmEKWMH4ItryV2UBBjpNRoM6HDe7XFKwVJ
 riOKX8VDrpdSdlV1jbCx9KB47BUwFygOYsFdW7gIDJ1hb8usN4nSYQNDIlZKEIQG
 cHNpv2XGT+pCSvyc4Iv2Fgyvnp25XensSQwQAtk5Y4/lJL1yrgcPjMOkPmRS+mmH
 BclDWNbL+gwqkyWxgfoivDBOetLgwJYTr2ewBr6QbBtwLB8rL4BxXIdomcoFPuxi
 jAxixZnTbS+Xq5S7uYK4r6KbaHGcJtwolXMGjx13IHmPfvYtTTQzfRcrBlAtQ/pV
 BDLoygmDVlkhSVx6bi5V5QZ06rcWYR4cRsBQ54FnBGMr730ZljgFONOHFtUab28T
 C+YUOaeLEYEYI0cIkkyoSuiz6avB6YvQAiyEPM0EdHZrQFwhBBw=
 =DFp8
 -----END PGP SIGNATURE-----

Merge tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull EDAC updates from Borislav Petkov:

 - Make ghes_edac a simple module like the rest of the EDAC drivers and
   drop the forced built-in only configuration by disentangling it from
   GHES (Jia He)

 - The usual small cleanups and improvements all over EDAC land

* tag 'edac_updates_for_6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  EDAC/i10nm: fix refcount leak in pci_get_dev_wrapper()
  EDAC/i5400: Fix typo in comment: vaious -> various
  EDAC/mc_sysfs: Increase legacy channel support to 12
  MAINTAINERS: Make Mauro EDAC reviewer
  MAINTAINERS: Make Manivannan Sadhasivam the maintainer of qcom_edac
  EDAC/igen6: Return the correct error type when not the MC owner
  apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
  EDAC: Check for GHES preference in the chipset-specific EDAC drivers
  EDAC/ghes: Make ghes_edac a proper module
  EDAC/ghes: Prepare to make ghes_edac a proper module
  EDAC/ghes: Add a notifier for reporting memory errors
  efi/cper: Export several helpers for ghes_edac to use
  EDAC/i5000: Mark as BROKEN
2022-12-12 14:47:31 -08:00
Jay Lu
87386ee83d ACPI: APEI: EINJ: Refactor available_error_type_show()
Move error type descriptions into an array and loop over error types
to improve readability and maintainability.

Replace seq_printf() with seq_puts() as recommended by checkpatch.pl.

Signed-off-by: Jay Lu <jaylu102@amd.com>
Co-developed-by: Ben Cheatham <benjamin.cheatham@amd.com>
Signed-off-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-07 18:16:12 +01:00
Jay Lu
37ea969386 ACPI: APEI: EINJ: Fix formatting errors
Checkpatch reveals warnings and an error due to missing lines and
incorrect indentations. Add the missing lines after declarations and
fix the suspect indentations.

Signed-off-by: Jay Lu <jaylu102@amd.com>
Signed-off-by: Ben Cheatham <benjamin.cheatham@amd.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-07 18:16:12 +01:00
Christophe JAILLET
89da5c476d ACPI: APEI: Remove a useless include
This file does not use rcu, so there is no point in including
<linux/rculist.h>.

So just remove it.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-12-02 20:18:50 +01:00
Sudeep Holla
e78963f5c5 ACPI: APEI: Silence missing prototype warnings
Silence the following warnings when make W=1:

 | CC   drivers/acpi/apei/apei-base.c
 |      warning: no previous prototype for 'arch_apei_enable_cmcff' [-Wmissing-prototypes]
 |              int __weak arch_apei_enable_cmcff(struct acpi_hest_header *hest_hdr,
 |                         ^
 | CC   drivers/acpi/apei/apei-base.c
 |      warning: no previous prototype for 'arch_apei_report_mem_error' [-Wmissing-prototypes]
 |              void __weak arch_apei_report_mem_error(int sev,
 |                          ^

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-11-23 19:34:53 +01:00
Ard Biesheuvel
dd3fa54b2e apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
Some documentation first, about how this machinery works:

It seems, the intent of the GHES error records cache is to collect
already reported errors - see the ghes_estatus_cached() checks. There's
even a sentence trying to say what this does:

  /*
   * GHES error status reporting throttle, to report more kinds of
   * errors, instead of just most frequently occurred errors.
   */

New elements are added to the cache this way:

  if (!ghes_estatus_cached(estatus)) {
          if (ghes_print_estatus(NULL, ghes->generic, estatus))
                  ghes_estatus_cache_add(ghes->generic, estatus);

The intent being, once this new error record is reported, it gets cached
so that it doesn't get reported for a while due to too many, same-type
error records getting reported in burst-like scenarios. I.e., new,
unreported error types can have a higher chance of getting reported.

Now, the loop in ghes_estatus_cache_add() is trying to pick out the
oldest element in there. Meaning, something which got reported already
but a long while ago, i.e., a LRU-type scheme.

And the cmpxchg() is there presumably to make sure when that selected
element slot_cache is removed, it really *is* that element that gets
removed and not one which replaced it in the meantime.

Now, ghes_estatus_cache_add() selects a slot, and either succeeds in
replacing its contents with a pointer to a newly cached item, or it just
gives up and frees the new item again, without attempting to select
another slot even if one might be available.

Since only inserting new items is being done here, the race can only
cause a failure if the selected slot was updated with another new item
concurrently, which means that it is arbitrary which of those two items
gets dropped.

And "dropped" here means, the item doesn't get added to the cache so
the next time it is seen, it'll get reported again and an insertion
attempt will be done again. Eventually, it'll get inserted and all those
times when the insertion fails, the item will get reported although the
cache is supposed to prevent that and "ratelimit" those repeated error
records. Not a big deal in any case.

This means the cmpxchg() and the special case are not necessary.
Therefore, just drop the existing item unconditionally.

Move the xchg_release() and call_rcu() out of rcu_read_lock/unlock
section since there is no actually dereferencing the pointer at all.

  [ bp:
    - Flesh out and summarize what was discussed on the thread now
      that that cache contraption is understood;
    - Touch up code style. ]

Co-developed-by: Jia He <justin.he@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-7-justin.he@arm.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28 18:50:41 +02:00
Uwe Kleine-König
36006ccb6b ACPI: APEI: Drop unsetting driver data on remove
Since commit 0998d0631001 ("device-core: Ensure drvdata = NULL when no
driver is bound") the driver core cares for cleaning driver data, so
don't do it in the driver, too.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-28 18:38:56 +02:00
Ard Biesheuvel
0d2aa70b8f apei/ghes: Use xchg_release() for updating new cache slot instead of cmpxchg()
Some documentation first, about how this machinery works:

It seems, the intent of the GHES error records cache is to collect
already reported errors - see the ghes_estatus_cached() checks. There's
even a sentence trying to say what this does:

  /*
   * GHES error status reporting throttle, to report more kinds of
   * errors, instead of just most frequently occurred errors.
   */

New elements are added to the cache this way:

  if (!ghes_estatus_cached(estatus)) {
          if (ghes_print_estatus(NULL, ghes->generic, estatus))
                  ghes_estatus_cache_add(ghes->generic, estatus);

The intent being, once this new error record is reported, it gets cached
so that it doesn't get reported for a while due to too many, same-type
error records getting reported in burst-like scenarios. I.e., new,
unreported error types can have a higher chance of getting reported.

Now, the loop in ghes_estatus_cache_add() is trying to pick out the
oldest element in there. Meaning, something which got reported already
but a long while ago, i.e., a LRU-type scheme.

And the cmpxchg() is there presumably to make sure when that selected
element slot_cache is removed, it really *is* that element that gets
removed and not one which replaced it in the meantime.

Now, ghes_estatus_cache_add() selects a slot, and either succeeds in
replacing its contents with a pointer to a newly cached item, or it just
gives up and frees the new item again, without attempting to select
another slot even if one might be available.

Since only inserting new items is being done here, the race can only
cause a failure if the selected slot was updated with another new item
concurrently, which means that it is arbitrary which of those two items
gets dropped.

And "dropped" here means, the item doesn't get added to the cache so
the next time it is seen, it'll get reported again and an insertion
attempt will be done again. Eventually, it'll get inserted and all those
times when the insertion fails, the item will get reported although the
cache is supposed to prevent that and "ratelimit" those repeated error
records. Not a big deal in any case.

This means the cmpxchg() and the special case are not necessary.
Therefore, just drop the existing item unconditionally.

Move the xchg_release() and call_rcu() out of rcu_read_lock/unlock
section since there is no actually dereferencing the pointer at all.

  [ bp:
    - Flesh out and summarize what was discussed on the thread now
      that that cache contraption is understood;
    - Touch up code style. ]

Co-developed-by: Jia He <justin.he@arm.com>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-7-justin.he@arm.com
2022-10-24 17:22:25 +02:00
Jia He
802e7f1dfe EDAC/ghes: Make ghes_edac a proper module
Commit

  dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")

introduced a bug leading to ghes_edac_register() to be invoked before
edac_init(). Because at that time the bus "edac" hadn't been even
registered, this created sysfs nodes as /devices/mc0 instead of
/sys/devices/system/edac/mc/mc0 on an Ampere eMag server.

Fix this by turning ghes_edac into a proper module.

The list of GHES devices returned is not protected from being modified
concurrently but it is pretty static as it gets created only during GHES
init and latter is not a module so...

  [ bp: Massage. ]

Fixes: dc4e8c07e9e2 ("ACPI: APEI: explicit init of HEST and GHES in apci_init()")
Co-developed-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-5-justin.he@arm.com
2022-10-21 21:59:19 +02:00
Jia He
9057a3f7ac EDAC/ghes: Prepare to make ghes_edac a proper module
To make ghes_edac a proper module, prepare to decouple its dependencies
from GHES.

Move the ghes_edac.force_load parameter to ghes.c in order to
properly control whether ghes_edac should be force-loaded: In
ghes_edac_register() it is too late to set the module flag.

Introduce a helper ghes_get_devices(), which returns the list of GHES
devices which got probed when the platform-check passes on the system.

The previous force_load check is not needed in ghes_edac_unregister()
since it will be checked in the module's init function of ghes_edac
later.

  [ bp: Massage. ]

Suggested-by: Toshi Kani <toshi.kani@hpe.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-4-justin.he@arm.com
2022-10-21 19:32:38 +02:00
Jia He
8e40612f61 EDAC/ghes: Add a notifier for reporting memory errors
In order to make it a proper module and disentangle it from facilities,
add a notifier for reporting memory errors. Use an atomic notifier
because calls sites like ghes_proc_in_irq() run in interrupt context.

  [ bp: Massage commit message. ]

Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221010023559.69655-3-justin.he@arm.com
2022-10-20 13:25:53 +02:00
Ashish Kalra
43d2748394 ACPI: APEI: Fix integer overflow in ghes_estatus_pool_init()
Change num_ghes from int to unsigned int, preventing an overflow
and causing subsequent vmalloc() to fail.

The overflow happens in ghes_estatus_pool_init() when calculating
len during execution of the statement below as both multiplication
operands here are signed int:

len += (num_ghes * GHES_ESOURCE_PREALLOC_MAX_SIZE);

The following call trace is observed because of this bug:

[    9.317108] swapper/0: vmalloc error: size 18446744071562596352, exceeds total pages, mode:0xcc0(GFP_KERNEL), nodemask=(null),cpuset=/,mems_allowed=0-1
[    9.317131] Call Trace:
[    9.317134]  <TASK>
[    9.317137]  dump_stack_lvl+0x49/0x5f
[    9.317145]  dump_stack+0x10/0x12
[    9.317146]  warn_alloc.cold+0x7b/0xdf
[    9.317150]  ? __device_attach+0x16a/0x1b0
[    9.317155]  __vmalloc_node_range+0x702/0x740
[    9.317160]  ? device_add+0x17f/0x920
[    9.317164]  ? dev_set_name+0x53/0x70
[    9.317166]  ? platform_device_add+0xf9/0x240
[    9.317168]  __vmalloc_node+0x49/0x50
[    9.317170]  ? ghes_estatus_pool_init+0x43/0xa0
[    9.317176]  vmalloc+0x21/0x30
[    9.317177]  ghes_estatus_pool_init+0x43/0xa0
[    9.317179]  acpi_hest_init+0x129/0x19c
[    9.317185]  acpi_init+0x434/0x4a4
[    9.317188]  ? acpi_sleep_proc_init+0x2a/0x2a
[    9.317190]  do_one_initcall+0x48/0x200
[    9.317195]  kernel_init_freeable+0x221/0x284
[    9.317200]  ? rest_init+0xe0/0xe0
[    9.317204]  kernel_init+0x1a/0x130
[    9.317205]  ret_from_fork+0x22/0x30
[    9.317208]  </TASK>

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[ rjw: Subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-13 20:40:09 +02:00
Shuai Xue
415fed694f ACPI: APEI: do not add task_work to kernel thread to avoid memory leak
If an error is detected as a result of user-space process accessing a
corrupt memory location, the CPU may take an abort. Then the platform
firmware reports kernel via NMI like notifications, e.g. NOTIFY_SEA,
NOTIFY_SOFTWARE_DELEGATED, etc.

For NMI like notifications, commit 7f17b4a121d0 ("ACPI: APEI: Kick the
memory_failure() queue for synchronous errors") keep track of whether
memory_failure() work was queued, and make task_work pending to flush out
the queue so that the work is processed before return to user-space.

The code use init_mm to check whether the error occurs in user space:

    if (current->mm != &init_mm)

The condition is always true, becase _nobody_ ever has "init_mm" as a real
VM any more.

In addition to abort, errors can also be signaled as asynchronous
exceptions, such as interrupt and SError. In such case, the interrupted
current process could be any kind of thread. When a kernel thread is
interrupted, the work ghes_kick_task_work deferred to task_work will never
be processed because entry_handler returns to call ret_to_kernel() instead
of ret_to_user(). Consequently, the estatus_node alloced from
ghes_estatus_pool in ghes_in_nmi_queue_one_entry() will not be freed.
After around 200 allocations in our platform, the ghes_estatus_pool will
run of memory and ghes_in_nmi_queue_one_entry() returns ENOMEM. As a
result, the event failed to be processed.

    sdei: event 805 on CPU 113 failed with error: -2

Finally, a lot of unhandled events may cause platform firmware to exceed
some threshold and reboot.

The condition should generally just do

    if (current->mm)

as described in active_mm.rst documentation.

Then if an asynchronous error is detected when a kernel thread is running,
(e.g. when detected by a background scrubber), do not add task_work to it
as the original patch intends to do.

Fixes: 7f17b4a121d0 ("ACPI: APEI: Kick the memory_failure() queue for synchronous errors")
Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-10-04 16:04:41 +02:00
ye xingchen
382c5fec89 ACPI: APEI: Remove unneeded result variables
Return the erst_get_record_id_begin() and apei_exec_write_register()
return values directly instead of storing them in redundant local
variables.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
[ rjw: Changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-24 18:50:42 +02:00
Dmitry Monakhov
3709642896 ACPI: APEI: Add BERT error log footer
Print total number of records found during BERT log parsing.
This also simplify dmesg parser implementation for BERT events.

Signed-off-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Acked-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-09-03 20:25:39 +02:00
Dan Williams
b13a3e5fd4 ACPI: APEI: Fix _EINJ vs EFI_MEMORY_SP
When a platform marks a memory range as "special purpose" it is not
onlined as System RAM by default. However, it is still suitable for
error injection. Add IORES_DESC_SOFT_RESERVED to einj_error_inject() as
a permissible memory type in the sanity checking of the arguments to
_EINJ.

Fixes: 262b45ae3ab4 ("x86/efi: EFI soft reservation to E820 enumeration")
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reported-by: Omar Avelar <omar.avelar@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-29 20:24:19 +02:00
Tony Luck
c3481b6b75 ACPI: APEI: Better fix to avoid spamming the console with old error logs
The fix in commit 3f8dec116210 ("ACPI/APEI: Limit printable size of BERT
table data") does not work as intended on systems where the BIOS has a
fixed size block of memory for the BERT table, relying on s/w to quit
when it finds a record with estatus->block_status == 0. On these systems
all errors are suppressed because the check:

	if (region_len < ACPI_BERT_PRINT_MAX_LEN)

always fails.

New scheme skips individual CPER records that are too large, and also
limits the total number of records that will be printed to 5.

Fixes: 3f8dec116210 ("ACPI/APEI: Limit printable size of BERT table data")
Cc: All applicable <stable@vger.kernel.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-29 19:55:11 +02:00
Xiang wangx
55b350529e ACPI: APEI: Fix double word in a comment
Delete the redundant word 'the'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
[ rjw: New subject ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-06-09 20:46:47 +02:00
Tony Luck
ab59c89396 ACPI, APEI, EINJ: Refuse to inject into the zero page
Some validation tests dynamically inject errors into memory used by
applications to check that the system can recover from a variety of
poison consumption sceenarios.

But sometimes the virtual address picked by these tests is mapped to
the zero page.

This causes additional unexpected machine checks as other processes that
map the zero page also consume the poison.

Disallow injection to the zero page.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-22 16:52:27 +02:00
Liu Xinpeng
a090931524 ACPI: APEI: Fix missing ERST record id
Read a record is cleared by others, but the deleted record cache entry is
still created by erst_get_record_id_next. When next enumerate the records,
get the cached deleted record, then erst_read() return -ENOENT and try to
get next record, loop back to first ID will return 0 in function
__erst_record_id_cache_add_one and then set record_id as
APEI_ERST_INVALID_RECORD_ID, finished this time read operation.
It will result in read the records just in the cache hereafter.

This patch cleared the deleted record cache, fix the issue that
"./erst-inject -p" shows record counts not equal to "./erst-inject -n".

A reproducer of the problem(retry many times):

[root@localhost erst-inject]# ./erst-inject -c 0xaaaaa00011
[root@localhost erst-inject]# ./erst-inject -p
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00012
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00013
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00014
[root@localhost erst-inject]# ./erst-inject -i 0xaaaaa000006
[root@localhost erst-inject]# ./erst-inject -i 0xaaaaa000007
[root@localhost erst-inject]# ./erst-inject -i 0xaaaaa000008
[root@localhost erst-inject]# ./erst-inject -p
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00012
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00013
rc: 273
rcd sig: CPER
rcd id: 0xaaaaa00014
[root@localhost erst-inject]# ./erst-inject -n
total error record count: 6

Signed-off-by: Liu Xinpeng <liuxp11@chinatelecom.cn>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 20:29:24 +02:00
Jakob Koschel
fa34165096 ACPI, APEI: Use the correct variable for sizeof()
While the original code is valid, it is not the obvious choice for the
sizeof() call and in preparation to limit the scope of the list iterator
variable the sizeof should be changed to the size of the variable
being allocated.

Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-22 18:52:21 +01:00
Darren Hart
3f8dec1162 ACPI/APEI: Limit printable size of BERT table data
Platforms with large BERT table data can trigger soft lockup errors
while attempting to print the entire BERT table data to the console at
boot:

  watchdog: BUG: soft lockup - CPU#160 stuck for 23s! [swapper/0:1]

Observed on Ampere Altra systems with a single BERT record of ~250KB.

The original bert driver appears to have assumed relatively small table
data. Since it is impractical to reassemble large table data from
interwoven console messages, and the table data is available in

  /sys/firmware/acpi/tables/data/BERT

limit the size for tables printed to the console to 1024 (for no reason
other than it seemed like a good place to kick off the discussion, would
appreciate feedback from existing users in terms of what size would
maintain their current usage model).

Alternatively, we could make printing a CONFIG option, use the
bert_disable boot arg (or something similar), or use a debug log level.
However, all those solutions require extra steps or change the existing
behavior for small table data. Limiting the size preserves existing
behavior on existing platforms with small table data, and eliminates the
soft lockups for platforms with large table data, while still making it
available.

Signed-off-by: Darren Hart <darren@os.amperecomputing.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-09 19:40:13 +01:00
Randy Dunlap
f3303ff649 ACPI: APEI: fix return value of __setup handlers
__setup() handlers should return 1 to indicate that the boot option
has been handled. Returning 0 causes a boot option to be listed in
the Unknown kernel command line parameters and also added to init's
arg list (if no '=' sign) or environment list (if of the form 'a=b').

Unknown kernel command line parameters "erst_disable
  bert_disable hest_disable BOOT_IMAGE=/boot/bzImage-517rc6", will be
  passed to user space.

 Run /sbin/init as init process
   with arguments:
     /sbin/init
     erst_disable
     bert_disable
     hest_disable
   with environment:
     HOME=/
     TERM=linux
     BOOT_IMAGE=/boot/bzImage-517rc6

Fixes: a3e2acc5e37b ("ACPI / APEI: Add Boot Error Record Table (BERT) support")
Fixes: a08f82d08053 ("ACPI, APEI, Error Record Serialization Table (ERST) support")
Fixes: 9dc966641677 ("ACPI, APEI, HEST table parsing")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Igor Zhbanov <i.zhbanov@omprussia.ru>
Link: lore.kernel.org/r/64644a2f-4a20-bab3-1e15-3b2cdd0defe3@omprussia.ru
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-08 19:43:39 +01:00
Shuai Xue
27e932a314 ACPI: APEI: rename ghes_init() with an "acpi_" prefix
ghes_init() sticks out in acpi_init() because it is the only functions
without an "acpi_" prefix.

Rename ghes_init with an "acpi_" prefix, then all looks fine.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-03 20:25:23 +01:00
Shuai Xue
dc4e8c07e9 ACPI: APEI: explicit init of HEST and GHES in apci_init()
From commit e147133a42cb ("ACPI / APEI: Make hest.c manage the estatus
memory pool") was merged, ghes_init() relies on acpi_hest_init() to manage
the estatus memory pool. On the other hand, ghes_init() relies on
sdei_init() to detect the SDEI version and (un)register events. The
dependencies are as follows:

    ghes_init() => acpi_hest_init() => acpi_bus_init() => acpi_init()
    ghes_init() => sdei_init()

HEST is not PCI-specific and initcall ordering is implicit and not
well-defined within a level.

Based on above, remove acpi_hest_init() from acpi_pci_root_init() and
convert ghes_init() and sdei_init() from initcalls to explicit calls in the
following order:

    acpi_hest_init()
    ghes_init()
        sdei_init()

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-03-03 20:24:22 +01:00
Tony Luck
3ad6fd77a2 x86/sgx: Add check for SGX pages to ghes_do_memory_failure()
SGX EPC pages do not have a "struct page" associated with them so the
pfn_valid() sanity check fails and results in a warning message to
the console.

Add an additional check to skip the warning if the address of the error
is in an SGX EPC page.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20211026220050.697075-8-tony.luck@intel.com
2021-11-15 11:13:16 -08:00
Tony Luck
c6acb1e7bf x86/sgx: Add hook to error injection address validation
SGX reserved memory does not appear in the standard address maps.

Add hook to call into the SGX code to check if an address is located
in SGX memory.

There are other challenges in injecting errors into SGX. Update the
documentation with a sequence of operations to inject.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Tested-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lkml.kernel.org/r/20211026220050.697075-7-tony.luck@intel.com
2021-11-15 11:13:16 -08:00
Christoph Hellwig
06606646af ACPI: APEI: mark apei_hest_parse() static
apei_hest_parse() is only used in hest.c, so mark it static.

Signed-off-by: Christoph Hellwig <hch@lst.de>
[ rjw: Minor subject and changelog edits ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-27 20:34:47 +02:00
Shuai Xue
bf7fc0c369 ACPI: APEI: EINJ: Relax platform response timeout to 1 second
When injecting an error into the platform, the OSPM executes an
EXECUTE_OPERATION action to instruct the platform to begin the injection
operation. And then, the OSPM busy waits for a while by continually
executing CHECK_BUSY_STATUS action until the platform indicates that the
operation is complete. More specifically, the platform is limited to
respond within 1 millisecond right now. This is too strict for some
platforms.

For example, in Arm platform, when injecting a Processor Correctable error,
the OSPM will warn:
    Firmware does not respond in time.

And a message is printed on the console:
    echo: write error: Input/output error

We observe that the waiting time for DDR error injection is about 10 ms and
that for PCIe error injection is about 500 ms in Arm platform.

In this patch, we relax the response timeout to 1 second.

Signed-off-by: Shuai Xue <xueshuai@linux.alibaba.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-10-27 20:33:26 +02:00
Xiaofei Tan
ccb5ecdc2d ACPI: APEI: fix synchronous external aborts in user-mode
Before commit 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea()
synchronise with APEI's irq work"), do_sea() would unconditionally
signal the affected task from the arch code. Since that change,
the GHES driver sends the signals.

This exposes a problem as errors the GHES driver doesn't understand
or doesn't handle effectively are silently ignored. It will cause
the errors get taken again, and circulate endlessly. User-space task
get stuck in this loop.

Existing firmware on Kunpeng9xx systems reports cache errors with the
'ARM Processor Error' CPER records.

Do memory failure handling for ARM Processor Error Section just like
for Memory Error Section.

Fixes: 8fcc4ae6faf8 ("arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work")
Signed-off-by: Xiaofei Tan <tanxiaofei@huawei.com>
Reviewed-by: James Morse <james.morse@arm.com>
[ rjw: Subject edit ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-06-17 14:05:28 +02:00
Jon Hunter
b7a732a73a ACPI: APEI: Don't warn if ACPI is disabled
If ACPI is not enabled but support for ACPI and APEI is enabled in the
kernel, then the following warning is seen on boot ...

 WARNING KERN EINJ: ACPI disabled.

For ARM64 platforms, the 'acpi_disabled' variable is true by default
and hence, the above is often seen on ARM64. Given that it can be
normal for ACPI to be disabled, make this an informational print rather
that a warning.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-05-21 18:59:49 +02:00
Rafael J. Wysocki
b6237f61fc Merge branch 'acpi-misc'
* acpi-misc:
  ACPI: dock: fix some coding style issues
  ACPI: sysfs: fix some coding style issues
  ACPI: PM: add a missed blank line after declarations
  ACPI: custom_method: fix a coding style issue
  ACPI: CPPC: fix some coding style issues
  ACPI: button: fix some coding style issues
  ACPI: battery: fix some coding style issues
  ACPI: acpi_pad: add a missed blank line after declarations
  ACPI: LPSS: add a missed blank line after declarations
  ACPI: ipmi: remove useless return statement for void function
  ACPI: processor: fix some coding style issues
  ACPI: APD: fix a block comment align issue
  ACPI: AC: fix some coding style issues
  ACPI: fix various typos in comments
2021-04-26 17:04:41 +02:00
Colin Ian King
c3f2311e4b ACPI: APEI: remove redundant assignment to variable rc
The variable rc is being assigned a value that is never read,
the assignment is redundant and can be removed.

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2021-04-21 18:51:25 +02:00