linux/drivers/acpi/apei
Prarit Bhargava a545715d2d ACPI / APEI: Fix NMI notification handling
When removing and re-adding CPU 0 on a system with GHES NMI, the following
stack trace is seen when the CPU is re-added:

WARNING: CPU: 0 PID: 0 at arch/x86/kernel/apic/apic.c:1349 setup_local_APIC+
Modules linked in: nfsv3 rpcsec_gss_krb5 nfsv4 nfs fscache coretemp intel_ra
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.9.0-rc6+ #2
Call Trace:
 dump_stack+0x63/0x8e
 __warn+0xd1/0xf0
 warn_slowpath_null+0x1d/0x20
 setup_local_APIC+0x275/0x370
 apic_ap_setup+0xe/0x20
 start_secondary+0x48/0x180
 set_init_arg+0x55/0x55
 early_idt_handler_array+0x120/0x120
 x86_64_start_reservations+0x2a/0x2c
 x86_64_start_kernel+0x13d/0x14c

During CPU bringup, wakeup_cpu_via_init_nmi() is called and issues an
NMI on CPU 0.  The GHES NMI handler, ghes_notify_nmi(), queues the
ghes_proc_irq_work irq_work, which ends up raising IRQ_WORK_VECTOR
(0xf6).  The "faulty" IRR bit reported at arch/x86/kernel/apic/apic.c:1349
is also 0xf6 (specifically, the APIC IRR for irqs 255 to 224 is 0x400000),
which confirms that something has set the IRQ_WORK_VECTOR line prior to the
APIC being initialized.

Commit 2383844d48 ("GHES: Elliminate double-loop in the NMI handler")
incorrectly modified the behavior such that the handler returns
NMI_HANDLED only if an error was processed, and incorrectly runs the ghes
work queue for every NMI.

This patch restores the behavior the handler had prior to
2383844d48 ("GHES: Elliminate double-loop in the NMI handler"):
ghes_notify_nmi() properly returns NMI_HANDLED, and ghes_proc_irq_work
is queued only if NMI_HANDLED has been set.
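
The intended control flow can be sketched in user-space C. This is a
minimal illustration under stated assumptions, not the kernel code from
ghes.c: struct mock_source, mock_irq_work_queue() and mock_notify_nmi()
are hypothetical stand-ins, and only the NMI_DONE/NMI_HANDLED return
values mirror the kernel's. It shows the handler reporting NMI_HANDLED
only when some error source had a pending error, and queueing the
deferred work only in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same numeric values as the kernel's NMI handler return codes. */
enum { NMI_DONE = 0, NMI_HANDLED = 1 };

/* Hypothetical stand-in for one GHES error source. */
struct mock_source {
	bool has_error;	/* true if an error status block is pending */
};

/* Counts how often the deferred work was queued (hypothetical). */
static int work_queued;

/* Hypothetical stand-in for irq_work_queue(&ghes_proc_irq_work). */
static void mock_irq_work_queue(void)
{
	work_queued++;
}

/*
 * Sketch of the fixed handler: walk all sources, mark the NMI as
 * handled only if at least one source had an error, and queue the
 * deferred work only when the NMI was actually handled.
 */
static int mock_notify_nmi(struct mock_source *src, size_t n)
{
	int ret = NMI_DONE;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!src[i].has_error)
			continue;	/* nothing pending on this source */

		src[i].has_error = false;	/* "process" and clear the error */
		ret = NMI_HANDLED;
	}

	/* The key point of the fix: no work is queued for unrelated NMIs. */
	if (ret == NMI_HANDLED)
		mock_irq_work_queue();

	return ret;
}
```

With this shape, an NMI that arrives for an unrelated reason (such as
the wakeup NMI sent during CPU 0 bringup) returns NMI_DONE and leaves
work_queued unchanged, so nothing raises the irq_work vector before the
APIC is initialized.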

Fixes: 2383844d48 ("GHES: Elliminate double-loop in the NMI handler")
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2016-12-02 00:20:02 +01:00
apei-base.c ACPI / APEI: Fix leaked resources 2016-03-11 00:13:25 +01:00
apei-internal.h ACPI / APEI: Add Boot Error Record Table (BERT) support 2016-06-29 23:35:05 +02:00
bert.c ACPI / APEI: Add Boot Error Record Table (BERT) support 2016-06-29 23:35:05 +02:00
einj.c ACPI / einj: Make error paths more talkative 2016-06-23 23:41:38 +02:00
erst-dbg.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
erst.c pstore: Split pstore fragile flags 2016-09-08 15:01:08 -07:00
ghes.c ACPI / APEI: Fix NMI notification handling 2016-12-02 00:20:02 +01:00
hest.c ACPI: Remove FSF mailing addresses 2015-07-08 02:27:32 +02:00
Kconfig acpi, apei, ghes: Make NMI error notification to be GHES architecture extension. 2014-07-22 15:05:06 -07:00
Makefile ACPI / APEI: Add Boot Error Record Table (BERT) support 2016-06-29 23:35:05 +02:00