linux/arch/x86
Jue Wang 8ca97812c3 x86/mce: Work around an erratum on fast string copy instructions
A rare kernel panic scenario can happen when the following conditions
are met due to an erratum on fast string copy instructions:

1) An uncorrected error.
2) That error must be in first cache line of a page.
3) Kernel must execute page_copy from the page immediately before that
page.

The fast string copy instructions ("REP; MOVS*") could consume an
uncorrectable memory error in the cache line _right after_ the desired
region to copy and raise an MCE.

Bit 0 of MSR_IA32_MISC_ENABLE can be cleared to disable fast string
copy and will avoid such spurious machine checks. However, that is less
preferable due to the permanent performance impact. Considering memory
poison is rare, it's desirable to keep fast string copy enabled until an
MCE is seen.

Intel has confirmed the following:
1. The CPU erratum of fast string copy only applies to Skylake,
Cascade Lake and Cooper Lake generations.

Directly return from the MCE handler:
2. Will result in complete execution of the "REP; MOVS*" with no data
loss or corruption.
3. Will not result in another MCE firing on the next poisoned cache line
due to "REP; MOVS*".
4. Will resume execution from a correct point in code.
5. Will result in the same instruction that triggered the MCE firing a
second MCE immediately for any other software recoverable data fetch
errors.
6. Is not safe without disabling the fast string copy, as the next fast
string copy of the same buffer on the same CPU would result in a PANIC
MCE.

This should mitigate the erratum completely with the only caveat that
the fast string copy is disabled on the affected hyper thread thus
performance degradation.

This is still better than the OS crashing on MCEs raised on an
irrelevant process due to "REP; MOVS*' accesses in a kernel context,
e.g., copy_page.

Tested:

Injected errors on 1st cache line of 8 anonymous pages of process
'proc1' and observed MCE consumption from 'proc2' with no panic
(directly returned).

Without the fix, the host panicked within a few minutes on a
random 'proc2' process due to kernel access from copy_page.

  [ bp: Fix comment style + touch ups, zap an unlikely(), improve the
    quirk function's readability. ]

Signed-off-by: Jue Wang <juew@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Link: https://lore.kernel.org/r/20220218013209.2436006-1-juew@google.com
2022-02-19 14:26:42 +01:00
..
boot Kbuild updates for v5.17 2022-01-19 11:15:19 +02:00
configs x86/kbuild: Enable CONFIG_KALLSYMS_ALL=y in the defconfigs 2022-01-08 22:55:29 +01:00
crypto lib/crypto: blake2s: avoid indirect calls to compression function for Clang CFI 2022-02-04 19:22:32 +01:00
entry Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
events perf/x86/intel/pt: Fix crash with stop filters in single-range mode 2022-02-02 13:11:40 +01:00
hyperv hyperv-next for 5.17 2022-01-16 15:53:00 +02:00
ia32 audit/stable-5.16 PR 20211101 2021-11-01 21:17:39 -07:00
include Merge tip:locking/core into tip:ras/core 2022-02-13 22:03:04 +01:00
kernel x86/mce: Work around an erratum on fast string copy instructions 2022-02-19 14:26:42 +01:00
kvm KVM/arm64 fixes for 5.17, take #2 2022-02-05 00:58:25 -05:00
lib - Get rid of all the .fixup sections because this generates 2022-01-12 16:31:19 -08:00
math-emu x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
mm Merge branch 'akpm' (patches from Andrew) 2022-01-15 20:37:06 +02:00
net - Get rid of all the .fixup sections because this generates 2022-01-12 16:31:19 -08:00
pci PCI/sysfs: Find shadow ROM before static attribute initialization 2022-01-26 10:41:21 -06:00
platform - Get rid of all the .fixup sections because this generates 2022-01-12 16:31:19 -08:00
power x86: Prepare asm files for straight-line-speculation 2021-12-08 12:25:37 +01:00
purgatory x86/purgatory: Remove -nostdlib compiler flag 2021-12-30 14:13:06 +01:00
ras
realmode - Flush *all* mappings from the TLB after switching to the trampoline 2022-01-10 09:51:38 -08:00
tools x86/build: Use the proper name CONFIG_FW_LOADER 2021-12-29 22:20:38 +01:00
um bitmap patches for 5.17-rc1 2022-01-23 06:20:44 +02:00
video
xen xen/x2apic: Fix inconsistent indenting 2022-02-10 11:10:20 +01:00
.gitignore
Kbuild kbuild: use more subdir- for visiting subdirectories while cleaning 2021-10-24 13:49:46 +09:00
Kconfig ftrace: Have architectures opt-in for mcount build time sorting 2022-01-27 19:15:44 -05:00
Kconfig.assembler
Kconfig.cpu x86/mmx_32: Remove X86_USE_3DNOW 2021-12-11 09:09:45 +01:00
Kconfig.debug
Makefile x86: Add straight-line-speculation mitigation 2021-12-09 13:32:25 +01:00
Makefile_32.cpu
Makefile.um