212 Commits

Author SHA1 Message Date
Daniel Walter
164109e3cd arch/x86: replace strict_strto calls
Replace obsolete strict_strto calls with appropriate kstrto calls

Signed-off-by: Daniel Walter <dwalter@google.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-08-08 15:57:28 -07:00
Borislav Petkov
27c934158c x86, MCE: Robustify mcheck_init_device
BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.

So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.

Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-24 15:17:01 +02:00
Borislav Petkov
38356c1fbd x86, MCE: Kill CPU_POST_DEAD
In conjunction with cleaning up CPU hotplug, we want to get rid of
CPU_POST_DEAD. Kill this instance here and rediscover CMCI banks at the
end of CPU_DEAD.

Link: http://lkml.kernel.org/r/http://lkml.kernel.org/r/1400750624-19238-1-git-send-email-bp@alien8.de
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-06-22 18:36:39 +02:00
Chen Yucong
65eb71823b hwpoison: remove unused global variable in do_machine_check()
Remove an unused global variable mce_entry and relative operations in
do_machine_check().

Signed-off-by: Chen Yucong <slaoub@gmail.com>
Cc: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-06-04 16:54:11 -07:00
Borislav Petkov
716079f66e mce: Panic when a core has reached a timeout
There is very little and maybe practically nothing we can do to recover
from a system where at least one core has reached a timeout during the
whole monarch cores gathering. So panic when that happens.

Link: http://lkml.kernel.org/r/20140523091041.GA21332@pd.tnic
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-05-30 22:05:31 +02:00
Mathieu Souchaud
9c15a24b03 x86/mce: Improve mcheck_init_device() error handling
Check return code of every function called by mcheck_init_device().

Signed-off-by: Mathieu Souchaud <mattieu.souchaud@free.fr>
Link: http://lkml.kernel.org/r/1399151031-19905-1-git-send-email-mattieu.souchaud@free.fr
Signed-off-by: Borislav Petkov <bp@suse.de>
2014-05-30 22:01:40 +02:00
Ingo Molnar
ea431643d6 x86/mce: Fix CMCI preemption bugs
The following commit:

  27f6c573e0f7 ("x86, CMCI: Add proper detection of end of CMCI storms")

Added two preemption bugs:

 - machine_check_poll() does a get_cpu_var() without a matching
   put_cpu_var(), which causes preemption imbalance and crashes upon
   bootup.

 - it does percpu ops without disabling preemption. Preemption is not
   disabled due to the mistaken use of a raw spinlock.

To fix these bugs fix the imbalance and change
cmci_discover_lock to a regular spinlock.

Reported-by: Owen Kibel <qmewlo@gmail.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Chen, Gong <gong.chen@linux.intel.com>
Cc: Josh Boyer <jwboyer@fedoraproject.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Todorov <atodorov@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Link: http://lkml.kernel.org/n/tip-jtjptvgigpfkpvtQxpEk1at2@git.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
--
 arch/x86/kernel/cpu/mcheck/mce.c       |    4 +---
 arch/x86/kernel/cpu/mcheck/mce_intel.c |   18 +++++++++---------
 2 files changed, 10 insertions(+), 12 deletions(-)
2014-04-17 10:28:42 +02:00
Linus Torvalds
8eab6cd031 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Peter Anvin:
 "This is a collection of minor fixes for x86, plus the IRET information
  leak fix (forbid the use of 16-bit segments in 64-bit mode)"

NOTE! We may have to relax the "forbid the use of 16-bit segments in
64-bit mode" part, since there may be people who still run and depend on
16-bit Windows binaries under Wine.

But I'm taking this in the current unconditional form for now to see who
(if anybody) screams bloody murder.  Maybe nobody cares.  And maybe
we'll have to update it with some kind of runtime enablement (like our
vm.mmap_min_addr tunable that people who run dosemu/qemu/wine already
need to tweak).

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-64, modify_ldt: Ban 16-bit segments on 64-bit kernels
  efi: Pass correct file handle to efi_file_{read,close}
  x86/efi: Correct EFI boot stub use of code32_start
  x86/efi: Fix boot failure with EFI stub
  x86/platform/hyperv: Handle VMBUS driver being a module
  x86/apic: Reinstate error IRQ Pentium erratum 3AP workaround
  x86, CMCI: Add proper detection of end of CMCI storms
2014-04-11 11:58:33 -07:00
Chen, Gong
27f6c573e0 x86, CMCI: Add proper detection of end of CMCI storms
When CMCI storm persists for a long time(at least beyond predefined
threshold. It's 30 seconds for now), we can watch CMCI storm is
detected immediately after it subsides.

...
Dec 10 22:04:29 kernel: CMCI storm detected: switching to poll mode
Dec 10 22:04:59 kernel: CMCI storm subsided: switching to interrupt mode
Dec 10 22:04:59 kernel: CMCI storm detected: switching to poll mode
Dec 10 22:05:29 kernel: CMCI storm subsided: switching to interrupt mode
...

The problem is that our logic that determines that the storm has
ended is incorrect. We announce the end, re-enable interrupts and
realize that the storm is still going on, so we switch back to
polling mode. Rinse, repeat.

When a storm happens we disable signaling of errors via CMCI and begin
polling machine check banks instead. If we find any logged errors,
then we need to set a per-cpu flag so that our per-cpu tests that
check whether the storm is ongoing will see that errors are still
being logged independently of whether mce_notify_irq() says that the
error has been fully processed.

cmci_clear() is not the right tool to disable a bank. It disables the
interrupt for the bank as desired, but it also clears the bit for
this bank in "mce_banks_owned" so we will skip the bank when polling
(so we fail to see that the storm continues because we stop looking).
New cmci_storm_disable_banks() just disables the interrupt while
allowing polling to continue.

Reported-by: William Dauchy <wdauchy@gmail.com>
Signed-off-by: Chen, Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2014-03-28 13:40:16 -07:00
Srivatsa S. Bhat
82a8f131aa x86, mce: Fix CPU hotplug callback registration
Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Instead, the correct and race-free way of performing the callback
registration is:

	cpu_notifier_register_begin();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	/* Note the use of the double underscored version of the API */
	__register_cpu_notifier(&foobar_cpu_notifier);

	cpu_notifier_register_done();

Fix the mce code in x86 by using this latter form of callback registration.

Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2014-03-20 13:43:42 +01:00
Borislav Petkov
4f75d84127 x86, mce: Fix mce_start_timer semantics
So mce_start_timer() has a 'cpu' argument which is supposed to mean to
start a timer on that cpu. However, the code currently starts a timer on
the *current* cpu the function runs on and causes the sanity-check in
mce_timer_fn to fire:

	WARNING: CPU: 0 PID: 0 at arch/x86/kernel/cpu/mcheck/mce.c:1286 mce_timer_fn

because it is running on the wrong cpu.

This was triggered by Prarit Bhargava <prarit@redhat.com> by offlining
all the cpus in succession.

Then, we were fiddling with the CMCI storm settings when starting the
timer whereas there's no need for that - if there's storm happening
on this newly restarted cpu, we're going to be in normal CMCI mode
initially and then when the CMCI interrupt starts firing, we're going to
go to the polling mode with the timer real soon.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Prarit Bhargava <prarit@redhat.com>
Cc: Tony Luck <tony.luck@intel.com>
Reviewed-by: Chen, Gong <gong.chen@linux.intel.com>
Link: http://lkml.kernel.org/r/1387722156-5511-1-git-send-email-prarit@redhat.com
2014-01-12 15:22:25 +01:00
Levente Kurusa
853d9b18f1 x86, mce: Call put_device on device_register failure
This patch adds a call to put_device() when the device_register() call
has failed. This is required so that the last reference to the device is
given up.

Signed-off-by: Levente Kurusa <levex@linux.com>
Link: http://lkml.kernel.org/r/5298F900.9000208@linux.com
Signed-off-by: Borislav Petkov <bp@suse.de>
2013-11-30 12:14:51 +01:00
Ingo Molnar
0237d7f355 Merge branch 'x86/mce' into x86/ras
Pursue a single RAS/MCE topic branch on x86.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-08-12 17:54:05 +02:00
Paul Gortmaker
148f9bb877 x86: delete __cpuinit usage from all x86 files
The __cpuinit type of throwaway sections might have made sense
some time ago when RAM was more constrained, but now the savings
do not offset the cost and complications.  For example, the fix in
commit 5e427ec2d0 ("x86: Fix bit corruption at CPU resume time")
is a good example of the nasty type of bugs that can be created
with improper use of the various __init prefixes.

After a discussion on LKML[1] it was decided that cpuinit should go
the way of devinit and be phased out.  Once all the users are gone,
we can then finally remove the macros themselves from linux/init.h.

Note that some harmless section mismatch warnings may result, since
notify_cpu_starting() and cpu_up() are arch independent (kernel/cpu.c)
are flagged as __cpuinit  -- so if we remove the __cpuinit from
arch specific callers, we will also get section mismatch warnings.
As an intermediate step, we intend to turn the linux/init.h cpuinit
content into no-ops as early as possible, since that will get rid
of these warnings.  In any case, they are temporary and harmless.

This removes all the arch/x86 uses of the __cpuinit macros from
all C files.  x86 only had the one __CPUINIT used in assembly files,
and it wasn't paired off with a .previous or a __FINIT, so we can
delete it directly w/o any corresponding additional change there.

[1] https://lkml.org/lkml/2013/5/20/589

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
2013-07-14 19:36:56 -04:00
Naveen N. Rao
c3d1fb567a mce: acpi/apei: Honour Firmware First for MCA banks listed in APEI HEST CMC
The Corrected Machine Check structure (CMC) in HEST has a flag which can be
set by the firmware to indicate to the OS that it prefers to process the
corrected error events first. In this scenario, the OS is expected to not
monitor for corrected errors (through CMCI/polling). Instead, the firmware
notifies the OS on corrected error events through GHES.

Linux already has support for GHES. This patch adds support for parsing CMC
structure and to disable CMCI/polling if the firmware first flag is set.

Further, the list of machine check bank structures at the end of CMC is used
to determine which MCA banks function in FF mode, so that we continue to
monitor error events on the other banks.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-07-08 11:53:01 -07:00
Naveen N. Rao
0644414e62 mce: acpi/apei: Add comments to clarify usage of the various bitfields in the MCA subsystem
There is some confusion about the 'mce_poll_banks' and 'mce_banks_owned'
per-cpu bitmaps.  Provide comments so that we all know exactly what these
are used for, and why.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-06-25 13:53:27 -07:00
Srivatsa S. Bhat
7a0c819d28 x86/mce: Rework cmci_rediscover() to play well with CPU hotplug
Dave Jones reports that offlining a CPU leads to this trace:

numa_remove_cpu cpu 1 node 0: mask now 0,2-3
smpboot: CPU 1 is now offline
BUG: using smp_processor_id() in preemptible [00000000] code:
cpu-offline.sh/10591
caller is cmci_rediscover+0x6a/0xe0
Pid: 10591, comm: cpu-offline.sh Not tainted 3.9.0-rc3+ #2
Call Trace:
 [<ffffffff81333bbd>] debug_smp_processor_id+0xdd/0x100
 [<ffffffff8101edba>] cmci_rediscover+0x6a/0xe0
 [<ffffffff815f5b9f>] mce_cpu_callback+0x19d/0x1ae
 [<ffffffff8160ea66>] notifier_call_chain+0x66/0x150
 [<ffffffff8107ad7e>] __raw_notifier_call_chain+0xe/0x10
 [<ffffffff8104c2e3>] cpu_notify+0x23/0x50
 [<ffffffff8104c31e>] cpu_notify_nofail+0xe/0x20
 [<ffffffff815ef082>] _cpu_down+0x302/0x350
 [<ffffffff815ef106>] cpu_down+0x36/0x50
 [<ffffffff815f1c9d>] store_online+0x8d/0xd0
 [<ffffffff813edc48>] dev_attr_store+0x18/0x30
 [<ffffffff81226eeb>] sysfs_write_file+0xdb/0x150
 [<ffffffff811adfb2>] vfs_write+0xa2/0x170
 [<ffffffff811ae16c>] sys_write+0x4c/0xa0
 [<ffffffff81613019>] system_call_fastpath+0x16/0x1b

However, a look at cmci_rediscover shows that it can be simplified quite
a bit, apart from solving the above issue. It invokes functions that
take spin locks with interrupts disabled, and hence it can run in atomic
context. Also, it is run in the CPU_POST_DEAD phase, so the dying CPU
is already dead and out of the cpu_online_mask. So take these points into
account and simplify the code, and thereby also fix the above issue.

Reported-by: Dave Jones <davej@redhat.com>
Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2013-04-02 14:04:01 -07:00
Linus Torvalds
9043a2650c The sweeping change is to make add_taint() explicitly indicate whether to disable
lockdep, but it's a mechanical change.
 
 Cheers,
 Rusty.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJRJAcuAAoJENkgDmzRrbjxsw0P/3eXb+LddYnx0V0uHYdKpCUf
 4vdW7X0fX3Z+aUK69IWRL/6ahoO4TpaHYGHBDjEoivyQ0GDq14X7JNWsYYt3LdMf
 3wmDgRc2cn/mZOJbFeVpNV8ox5l/xc0CUvV+iQ8tMjfQItXMXgWUFZKMECsXKSO6
 eex3lrw9M2jAX2uL8LQPp9W8xtKu24nSZRC6tH5riE/8fCzi1cZPPAqfxP5c8Lee
 ZXtbCRSyAFENZLpKyMe1PC7HvtJyi5NDn9xwOQiXULZV/VOlvP94DGBLIKCM/6dn
 4QvZxpG0P0uOlpCgRAVLyh/z7g4XY4VF/fHopLCmEcqLsvgD+V2LQpQ9zWUalLPC
 Z+pUpz2vu0gIddPU1nR8R6oGpEdJ8O12aJle62p/RSXWZGx12qUQ+Tamu0tgKcv1
 AsiJfbUGNDYfxgU6sHsoQjl2f68LTVckCU1C1LqEbW/S104EIORtGx30CHM4LRiO
 32kDC5TtgYDBKQAIqJ4bL48ZMh+9W3uX40p7xzOI5khHQjvswUKa3jcxupU0C1uv
 lx8KXo7pn8WT33QGysWC782wJCgJuzSc2vRn+KQoqoynuHGM6agaEtR59gil3QWO
 rQEcxH63BBRDgHlg4FM9IkJwwsnC3PWKL8gbX0uAWXAPMbgapJkuuGZAwt0WDGVK
 +GszxsFkCjlW0mK0egTb
 =tiSY
 -----END PGP SIGNATURE-----

Merge tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux

Pull module update from Rusty Russell:
 "The sweeping change is to make add_taint() explicitly indicate whether
  to disable lockdep, but it's a mechanical change."

* tag 'modules-next-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux:
  MODSIGN: Add option to not sign modules during modules_install
  MODSIGN: Add -s <signature> option to sign-file
  MODSIGN: Specify the hash algorithm on sign-file command line
  MODSIGN: Simplify Makefile with a Kconfig helper
  module: clean up load_module a little more.
  modpost: Ignore ARC specific non-alloc sections
  module: constify within_module_*
  taint: add explicit flag to show whether lock dep is still OK.
  module: printk message when module signature fail taints kernel.
2013-02-25 15:41:43 -08:00
Rusty Russell
373d4d0997 taint: add explicit flag to show whether lock dep is still OK.
Fix up all callers as they were before, with make one change: an
unsigned module taints the kernel, but doesn't turn off lockdep.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2013-01-21 17:17:57 +10:30
Tejun Heo
4d899be584 x86/mce: don't use [delayed_]work_pending()
There's no need to test whether a (delayed) work item in pending
before queueing, flushing or cancelling it.  Most uses are unnecessary
and quite a few of them are buggy.

Remove unnecessary pending tests from x86/mce.  Only compile tested.

v2: Local var work removed from mce_schedule_work() as suggested by
    Borislav.

Signed-off-by: Tejun Heo <tj@kernel.org>
Acked-by: Borislav Petkov <bp@alien8.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: linux-edac@vger.kernel.org
2012-12-28 13:40:16 -08:00
Borislav Petkov
1462594bf2 x86, MCA: Finish mca_config conversion
mce_ser, mce_bios_cmci_threshold and mce_disabled are the last three
bools which need conversion. Move them to the mca_config struct and
adjust usage sites accordingly.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-10-26 14:37:58 +02:00
Borislav Petkov
7af19e4afd x86, MCA: Convert the next three variables batch
Move them into the mca_config struct and adjust code touching them
accordingly.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-10-26 14:37:57 +02:00
Borislav Petkov
84c2559dee x86, MCA: Convert rip_msr, mce_bootlog, monarch_timeout
Move above configuration variables into struct mca_config and adjust
usage places accordingly.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-10-26 14:37:57 +02:00
Borislav Petkov
d203f0b824 x86, MCA: Convert dont_log_ce, banks and tolerant
Move those MCA configuration variables into struct mca_config and adjust
the places they're used accordingly.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Acked-by: Tony Luck <tony.luck@intel.com>
2012-10-26 14:37:56 +02:00
Borislav Petkov
5bc66170dc x86, MCE: Remove bios_cmci_threshold sysfs attribute
450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold sysfs attribute which was
supposed to communicate to userspace tools that BIOS CMCI threshold has
been honoured.

However, this info is not of any importance to userspace - it should
rather get the actual error count it has been thresholded already from
MCi_STATUS[38:52].

So drop this before it becomes a used interface (good thing we caught
this early in 3.7-rc1, right after the merge window closed).

Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Acked-by: Tony Luck <tony.luck@intel.com>
Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.com
Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
2012-10-19 15:22:29 +02:00
Naveen N. Rao
450cc20103 x86/mce: Provide boot argument to honour bios-set CMCI threshold
The ACPI spec doesn't provide for a way for the bios to pass down
recommended thresholds to the OS on a _per-bank_ basis. This patch adds
a new boot option, which if passed, tells Linux to use CMCI thresholds
set by the bios.

As fail-safe, we initialize threshold to 1 if some banks have not been
initialized by the bios and warn the user.

Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-09-27 10:08:00 -07:00
Ingo Molnar
f1f6524476 Linux 3.6-rc6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.18 (GNU/Linux)
 
 iQEcBAABAgAGBQJQVkutAAoJEHm+PkMAQRiGW8sH/36FVQ3zI75QH16AmR++2nMZ
 BRJGoxcRFMssrXTYVdkMyzygf8b7MZbNEn1qt2g63MNzGaJucPlw5NVL4GLzR+zr
 x/EglLrTEPCD5el9wJ3ls9iC1soudKQTvC2BjcdUjpoSwHrDM/7GKfbOacE54Wqc
 C1VHCcg5DWOD7F0RnYT2SQEVCeDODNmcyFdk7Oi4cUicTPJoYWJ9O9MGfBDBok0N
 M+dXxa9nvsl7EeEKpBKH9vo4TfXn3Gsj6LCRdedvI15ilZjfo8jdHYbSn7KBfQuZ
 JIKRnqkaQ1JfMFt+M/JJZ1b/+Wrd4HLMmmn5oUmrGGIvhpi32nJfi/97+nSy8iU=
 =c5gW
 -----END PGP SIGNATURE-----

Merge tag 'v3.6-rc6' into x86/mce

Merge Linux v3.6-rc6, to refresh this tree.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-09-19 17:01:25 +02:00
Chen Gong
55babd8f41 x86/mce: Add CMCI poll mode
On Intel systems corrected machine check interrupts (CMCI) may be sent to
multiple logical processors; possibly to all processors on the affected
socket (SDM Volume 3B "15.5.1 CMCI Local APIC Interface").  This means
that a persistent error (such as a stuck bit in ECC memory) may cause
a storm of interrupts that greatly hinders or prevents forward progress
(probably on many processors).

To solve this we keep track of the rate at which each processor sees
CMCI. If we exceed a threshold, we disable CMCI delivery and switch to
polling the machine check banks. If the storm subsides (none of the
affected processors see any more errors for a complete poll interval) we
re-enable CMCI.

[Tony: Added console messages when storm begins/ends and increased storm
threshold from 5 to 15 so we have a few more logged entries before we
disable interrupts and start dropping reports]

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-08-09 11:44:51 -07:00
Thomas Gleixner
1a65f970d1 x86: mce: Remove the frozen cases in the hotplug code
No point in having double cases if we can simply mask the FROZEN bit
out.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-08-03 11:46:50 -07:00
Thomas Gleixner
26c3c283c5 x86: mce: Split timer init
Split timer init function into the init and the start part, so the
start part can replace the open coded version in CPU_DOWN_FAILED.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-08-03 11:46:29 -07:00
Linus Torvalds
1ca0049f2c Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Ingo Molnar:
 "Various fixes"

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86-64, kcmp: The kcmp system call can be common
  arch/x86/kernel/kdebugfs.c: Ensure a consistent return value in error case
  x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
  x86/mce: Move MCACOD defines from mce-severity.c to <asm/mce.h>
  x86/ioapic: Fix NULL pointer dereference on CPU hotplug after disabling irqs
  x86, nops: Missing break resulting in incorrect selection on Intel
  x86: CONFIG_CC_STACKPROTECTOR=y is no longer experimental
2012-08-03 10:59:36 -07:00
Tony Luck
61b0fccd7f x86/mce: Add quirk for instruction recovery on Sandy Bridge processors
Sandy Bridge processors follow the SDM (Vol 3B, Table 15-20) and
set both the RIPV and EIPV bits in the MCG_STATUS register to
zero for machine checks during instruction fetch. This is more
than a little counter-intuitive and means that Linux cannot
recover from these errors. Rather than insert special case code
at several places in mce.c and mce-severity.c, we pretend the
EIPV bit was set for just this case early in processing the
machine check.

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
Cc: Chen Gong <gong.chen@linux.intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Hidetoshi Seto <seto.hidetoshi@jp.fujitsu.com>
Link: http://lkml.kernel.org/r/180a06f3f357cf9f78259ae443a082b14a29535b.1343078495.git.tony.luck@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-07-26 15:05:47 +02:00
Linus Torvalds
62c4d9afa4 Features:
* Performance improvement to lower the amount of traps the hypervisor
    has to do 32-bit guests. Mainly for setting PTE entries and updating
    TLS descriptors.
  * MCE polling driver to collect hypervisor MCE buffer and present them to
    /dev/mcelog.
  * Physical CPU online/offline support. When an privileged guest is booted
    it is present with virtual CPUs, which might have an 1:1 to physical
    CPUs but usually don't. This provides mechanism to offline/online physical
    CPUs.
 Bug-fixes for:
  * Coverity found fixes in the console and ACPI processor driver.
  * PVonHVM kexec fixes along with some cleanups.
  * Pages that fall within E820 gaps and non-RAM regions (and had been
    released to hypervisor) would be populated back, but potentially in
    non-RAM regions.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.12 (GNU/Linux)
 
 iQEcBAABAgAGBQJQDWcvAAoJEFjIrFwIi8fJ6GAH/iFIkOC5wseD8qZ9nV4VI46t
 0GYvBFC4F91NvC7CNfoAySr84v+ZORIZzMcdyDF8H/tLO9MaOY/Mwn0S5ZSqmYMi
 rhskvK3InBaVkYtceOHugNGM7mB0c3STIm7OsjW6gbVzohmTN25rbQR+X5iWAtVA
 cTUtDyH3AU15mwuVT3U+VC4IulHpnNJz4pHoq3Sn61/UK1LYmhLXYd5fveA0D0B8
 lRZTAvNMsYDJDDmkWNrs8RczKkQ86DTSjfGawm0YG+Gf94GgD5yMHWbiHh2Gy93e
 u7sHK0RrKbP5BY/MV6vVJxkoV5NoWgCc0tcjBcYwdyvwzxDS75UhV6uoVHC3Ao8=
 =drt2
 -----END PGP SIGNATURE-----

Merge tag 'stable/for-linus-3.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull Xen update from Konrad Rzeszutek Wilk:
 "Features:
   * Performance improvement to lower the amount of traps the hypervisor
     has to do 32-bit guests.  Mainly for setting PTE entries and
     updating TLS descriptors.
   * MCE polling driver to collect hypervisor MCE buffer and present
     them to /dev/mcelog.
   * Physical CPU online/offline support.  When an privileged guest is
     booted it is present with virtual CPUs, which might have an 1:1 to
     physical CPUs but usually don't.  This provides mechanism to
     offline/online physical CPUs.
  Bug-fixes for:
   * Coverity found fixes in the console and ACPI processor driver.
   * PVonHVM kexec fixes along with some cleanups.
   * Pages that fall within E820 gaps and non-RAM regions (and had been
     released to hypervisor) would be populated back, but potentially in
     non-RAM regions."

* tag 'stable/for-linus-3.6-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
  xen: populate correct number of pages when across mem boundary (v2)
  xen PVonHVM: move shared_info to MMIO before kexec
  xen: simplify init_hvm_pv_info
  xen: remove cast from HYPERVISOR_shared_info assignment
  xen: enable platform-pci only in a Xen guest
  xen/pv-on-hvm kexec: shutdown watches from old kernel
  xen/x86: avoid updating TLS descriptors if they haven't changed
  xen/x86: add desc_equal() to compare GDT descriptors
  xen/mm: zero PTEs for non-present MFNs in the initial page table
  xen/mm: do direct hypercall in xen_set_pte() if batching is unavailable
  xen/hvc: Fix up checks when the info is allocated.
  xen/acpi: Fix potential memory leak.
  xen/mce: add .poll method for mcelog device driver
  xen/mce: schedule a workqueue to avoid sleep in atomic context
  xen/pcpu: Xen physical cpus online/offline sys interface
  xen/mce: Register native mce handler as vMCE bounce back point
  x86, MCE, AMD: Adjust initcall sequence for xen
  xen/mce: Add mcelog support for Xen platform
2012-07-24 13:14:03 -07:00
Linus Torvalds
5b160bd426 Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86/mce changes from Ingo Molnar:
 "This tree improves the AMD thresholding bank code and includes a
  memory fault signal handling fixlet."

* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults
  x86, MCE, AMD: Update copyrights and boilerplate
  x86, MCE, AMD: Give proper names to the thresholding banks
  x86, MCE, AMD: Make error_count read only
  x86, MCE, AMD: Cleanup reading of error_count
  x86, MCE, AMD: Print decimal thresholding values
  x86, MCE, AMD: Move shared bank to node descriptor
  x86, MCE, AMD: Remove local_allocate_... wrapper
  x86, MCE, AMD: Remove shared banks sysfs linking
  x86, amd_nb: Export model 0x10 and later PCI id
2012-07-22 16:07:45 -07:00
Liu, Jinsong
cef12ee52b xen/mce: Add mcelog support for Xen platform
When MCA error occurs, it would be handled by Xen hypervisor first,
and then the error information would be sent to initial domain for logging.

This patch gets error information from Xen hypervisor and convert
Xen format error into Linux format mcelog. This logic is basically
self-contained, not touching other kernel components.

By using tools like mcelog tool users could read specific error information,
like what they did under native Linux.

To test follow directions outlined in Documentation/acpi/apei/einj.txt

Acked-and-tested-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Ke, Liping <liping.ke@intel.com>
Signed-off-by: Jiang, Yunhong <yunhong.jiang@intel.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2012-07-19 15:51:36 -04:00
Tony Luck
6751ed65dc x86/mce: Fix siginfo_t->si_addr value for non-recoverable memory faults
In commit dad1743e5993f1 ("x86/mce: Only restart instruction after machine
check recovery if it is safe") we fixed mce_notify_process() to force a
signal to the current process if it was not restartable (RIPV bit not
set in MCG_STATUS). But doing it here means that the process doesn't
get told the virtual address of the fault via siginfo_t->si_addr. This
would prevent application level recovery from the fault.

Make a new MF_MUST_KILL flag bit for memory_failure() et al. to use so
that we will provide the right information with the signal.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: stable@kernel.org    # 3.4+
2012-07-11 10:20:47 -07:00
Ingo Molnar
6a991accee Merge commit 'v3.5-rc3' into x86/debug
Merge it in to pick up a fix that we are going to clean up in this
branch.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-20 14:22:34 +02:00
Thomas Gleixner
1a87fc1ec7 x86: mce: Add the dropped timer interval init back
commit 82f7af09 ("x86/mce: Cleanup timer mess) dropped the
initialization of the per cpu timer interval. Duh :(

Restore the previous behaviour.

Reported-by: Chen Gong <gong.chen@linux.intel.com>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2012-06-06 11:33:21 +02:00
Joe Perches
c767a54ba0 x86/debug: Add KERN_<LEVEL> to bare printks, convert printks to pr_<level>
Use a more current logging style:

 - Bare printks should have a KERN_<LEVEL> for consistency's sake
 - Add pr_fmt where appropriate
 - Neaten some macro definitions
 - Convert some Ok output to OK
 - Use "%s: ", __func__ in pr_fmt for summit
 - Convert some printks to pr_<level>

Message output is not identical in all cases.

Signed-off-by: Joe Perches <joe@perches.com>
Cc: levinsasha928@gmail.com
Link: http://lkml.kernel.org/r/1337655007.24226.10.camel@joe2Laptop
[ merged two similar patches, tidied up the changelog ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 09:17:22 +02:00
Chen Gong
958fb3c512 x86/mce: Fix the MCE poll timer logic
In commit 82f7af09 ("x86/mce: Cleanup timer mess), Thomas just
forgot the "/ 2" there while cleaning up.

Signed-off-by: Chen Gong <gong.chen@linux.intel.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bp@amd64.org
Cc: tony.luck@intel.com
Link: http://lkml.kernel.org/r/1338863702-9245-1-git-send-email-gong.chen@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-06-06 08:28:21 +02:00
Linus Torvalds
2d117403b3 One more mce cleanup before the 3.5 merge window closes
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPxpZ1AAoJEKurIx+X31iBACgP/jZiXAdu9JjYqvPA8/ACsntt
 S0fpbn40kKftn58x6Ddqemu/rg8Iy/ezsB7Fm93TYfBzDb8RGdkSiLj/KoO39Jzy
 Tmod/vJ0XMsBFgDob5GSKSOYMsnkO7//6jtnM09A8wDlxUwE3LV/3/84EqOrQH0n
 LR3Ltd6wnCc/HVasdDZX/814QcHe5KSoFMY2jUHWf0suOKcI22X57PQt9831bKky
 tvvsqFKSaOaerW9F1atwB2Qx37f0pUCu/Qo4KmBB0EVwSapRpHSDu657byft7VWQ
 FJ1eLfZF7lnVaYHxyCqz+wgTVBTsBPDt1xGkIJqSMMHtziG5iP3m0NSYLCJ5XJYx
 AcmF55hIx4yMCrIdiKc7wWs2K4U59+FiGJ2LgFCBZ3XLJoAuZnhf+WOh4ws/wi/j
 qDk9vK8KG3VBYIwZnTWb6N1bRtzznp1ElXjZR1byh/Cu2Ne/EWL64VydNT7ts0Ga
 3mGF88keAlw04qHYtU/7y5WrHRKGV5LBjGujVV2e2a5dC6p7LO11UCLptEv11DWS
 LcbIegbrimWmMxVScJQkL12GEzZKHpZZvrFRIKiWkTA15R6R1OTC7VrywA2GjDbd
 h5mWKd7zV6Ankjmq9SuvnA1UazhC/r+uQ58INVpFVM+aFor32HVd6L7VRns7sv2Z
 WbQNYxv5O+D3J2K+dQNy
 =4tLS
 -----END PGP SIGNATURE-----

Merge tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull mce cleanup from Tony Luck:
 "One more mce cleanup before the 3.5 merge window closes"

* tag 'please-pull-mce' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  x86/mce: Cleanup timer mess
2012-05-31 10:53:37 -07:00
Thomas Gleixner
82f7af09e6 x86/mce: Cleanup timer mess
Use unsigned long for dealing with jiffies not int. Rename the
callback to something sensible. Use __this_cpu_read/write for
accessing per cpu data.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-30 14:40:01 -07:00
H. Peter Anvin
bbd771474e Merge branch 'x86/trampoline' into x86/urgent
x86/trampoline contains an urgent commit which is necessarily on a
newer baseline.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2012-05-30 12:11:32 -07:00
Ingo Molnar
403e1c5b74 Merge branch 'x86/mce' into x86/urgent
Merge in these fixlets.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-30 14:12:06 +02:00
Linus Torvalds
786f02b719 x86/mce merge window patches (including two that make error_context() checks less sucky)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQIcBAABAgAGBQJPv7OiAAoJEKurIx+X31iBQxEP/RV8YO4nrozhHY597qabzfJc
 4YoCOma+wUyhXPDmZI80XvrIlcCq7TEJL1HTaAyA5rnyYvRHpM+uXDCRmbJDI4e0
 gA42/Y8+lbmR6BLY8sdptCXIWxw/d8wYEKK2BgNsPhkJxODGW/gVAws93erist/v
 yepq+GwI0QGAeRlO6AYgE7NwQmHXK5AdfH3phHUTVABVIUGH5+Zp6471FTB+hYy5
 aNvUnL0hw8vxrbpDL/Le359etPqC6wsELCIQ9wVtWCD0/UJM6Yd3j0+CKQ7q/KHU
 7zMcP+OCGTJ3koMhEbFIOnxuswWDGq5y/qIzSXMEEemGqgxqFUvX3wUeZ3HFAFNx
 nJ7ZaA813t7Bud4G4WwESxMGQpxI7FTvxnF1ow3IlRsMtV4ffvAS9xvWi0GQJrVY
 xixK7G87PGAm6fP9Zbb/lQlRO8gD498j4rfI9MOsUuY9QgFNcH2eg6c4O0iHDpos
 WkSgUaM49Q610JslrxsXp+BZZLBF/wbcjcFiQGFAWOIKTKgRQ99+dXAQY7fw9CIf
 /wNl9MkbOvJdPL9OfLTmAYAMXyaXbOX8qcvItwqcBsUT0AV863NdIXtS4BXBOrMs
 5u16CDX1ieFAlA2dzhynvE0Zd1Ws6wfe5W/BgtQ+H175uHFr8pHAxsBTX8GSNrXG
 /bSWWrR3CIBRHoWCJMmH
 =kG4e
 -----END PGP SIGNATURE-----

Merge tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras

Pull x86/mce merge window patches from Tony Luck:
 "Including two that make error_context() checks less sucky"

* tag 'x86-mce-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras:
  x86/mce: Add instruction recovery signatures to mce-severity table
  x86/mce: Fix check for processor context when machine check was taken.
  MCE: Fix vm86 handling for 32bit mce handler
  x86/mce Add validation check before GHES error is recorded
  x86/mce: Avoid reading every machine check bank register twice.
2012-05-25 16:14:12 -07:00
Andi Kleen
a129a7c845 MCE: Fix vm86 handling for 32bit mce handler
When running on 32bit the mce handler could misinterpret
vm86 mode as ring 0. This can affect whether it does recovery
or not; it was possible to panic when recovery was actually
possible.

Fix this by always forcing vm86 to look like ring 3.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-23 14:22:37 -07:00
Linus Torvalds
70311aaa8a Merge branch 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull MCE updates from Ingo Molnar:
 "This tree updates/fixes MCE hardware support, it makes the APIC LVT
  thresholding interrupt optional because a subset of AMD F15h models
  don't support it."

* 'x86-mce-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86, MCE, AMD: Disable error thresholding bank 4 on some models
  x86, MCE, AMD: Hide interrupt_enable sysfs node
  x86, MCE, AMD: Make APIC LVT thresholding interrupt optional
2012-05-23 11:01:52 -07:00
Borislav Petkov
80f033610f x86/mce: Fix 32-bit build
Got bitten again by the BIT() macro:

 arch/x86/kernel/cpu/mcheck/mce.c: In function '__mcheck_cpu_apply_quirks':
 arch/x86/kernel/cpu/mcheck/mce.c:1453:6: warning: left shift
 count >= width of type arch/x86/kernel/cpu/mcheck/mce.c:1454:7: warning: left shift count >= width of type

Fix it already.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Frank Arnold <frank.arnold@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1337684026-19740-2-git-send-email-bp@amd64.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2012-05-23 17:16:43 +02:00
Linus Torvalds
f5c101892f Merge branch 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
Pull percpu updates from Tejun Heo:
 "Contains Alex Shi's three patches to remove percpu_xxx() which overlap
  with this_cpu_xxx().  There shouldn't be any functional change."

* 'for-3.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
  percpu: remove percpu_xxx() functions
  x86: replace percpu_xxx funcs with this_cpu_xxx
  net: replace percpu_xxx funcs with this_cpu_xxx or __this_cpu_xxx
2012-05-22 17:37:47 -07:00
Tony Luck
dad1743e59 x86/mce: Only restart instruction after machine check recovery if it is safe
Section 15.3.1.2 of the software developer manual has this to say about the
RIPV bit in the IA32_MCG_STATUS register:

  RIPV (restart IP valid) flag, bit 0 — Indicates (when set) that program
  execution can be restarted reliably at the instruction pointed to by the
  instruction pointer pushed on the stack when the machine-check exception
  is generated.  When clear, the program cannot be reliably restarted at
  the pushed instruction pointer.

We need to save the state of this bit in do_machine_check() and use it
in mce_notify_process() to force a signal; even if memory_failure() says
it made a complete recovery ... e.g. replaced a clean LRU page.

Acked-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Tony Luck <tony.luck@intel.com>
2012-05-14 15:07:48 -07:00