linux/arch/x86
Sean Christopherson 21a36ac6b6 KVM: x86/mmu: Re-check under lock that TDP MMU SP hugepage is disallowed
Re-check sp->nx_huge_page_disallowed under the tdp_mmu_pages_lock spinlock
when adding a new shadow page in the TDP MMU.  To ensure the NX reclaim
kthread can't see a not-yet-linked shadow page, the page fault path links
the new page table prior to adding the page to possible_nx_huge_pages.

If the page is zapped by different task, e.g. because dirty logging is
disabled, between linking the page and adding it to the list, KVM can end
up triggering use-after-free by adding the zapped SP to the aforementioned
list, as the zapped SP's memory is scheduled for removal via RCU callback.
The bug is detected by the sanity checks guarded by CONFIG_DEBUG_LIST=y,
i.e. the below splat is just one possible signature.

  ------------[ cut here ]------------
  list_add corruption. prev->next should be next (ffffc9000071fa70), but was ffff88811125ee38. (prev=ffff88811125ee38).
  WARNING: CPU: 1 PID: 953 at lib/list_debug.c:30 __list_add_valid+0x79/0xa0
  Modules linked in: kvm_intel
  CPU: 1 PID: 953 Comm: nx_huge_pages_t Tainted: G        W          6.1.0-rc4+ #71
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
  RIP: 0010:__list_add_valid+0x79/0xa0
  RSP: 0018:ffffc900006efb68 EFLAGS: 00010286
  RAX: 0000000000000000 RBX: ffff888116cae8a0 RCX: 0000000000000027
  RDX: 0000000000000027 RSI: 0000000100001872 RDI: ffff888277c5b4c8
  RBP: ffffc90000717000 R08: ffff888277c5b4c0 R09: ffffc900006efa08
  R10: 0000000000199998 R11: 0000000000199a20 R12: ffff888116cae930
  R13: ffff88811125ee38 R14: ffffc9000071fa70 R15: ffff88810b794f90
  FS:  00007fc0415d2740(0000) GS:ffff888277c40000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000115201006 CR4: 0000000000172ea0
  Call Trace:
   <TASK>
   track_possible_nx_huge_page+0x53/0x80
   kvm_tdp_mmu_map+0x242/0x2c0
   kvm_tdp_page_fault+0x10c/0x130
   kvm_mmu_page_fault+0x103/0x680
   vmx_handle_exit+0x132/0x5a0 [kvm_intel]
   vcpu_enter_guest+0x60c/0x16f0
   kvm_arch_vcpu_ioctl_run+0x1e2/0x9d0
   kvm_vcpu_ioctl+0x271/0x660
   __x64_sys_ioctl+0x80/0xb0
   do_syscall_64+0x2b/0x50
   entry_SYSCALL_64_after_hwframe+0x46/0xb0
   </TASK>
  ---[ end trace 0000000000000000 ]---

Fixes: 61f9447854 ("KVM: x86/mmu: Set disallowed_nx_huge_page in TDP MMU before setting SPTE")
Reported-by: Greg Thelen <gthelen@google.com>
Analyzed-by: David Matlack <dmatlack@google.com>
Cc: David Matlack <dmatlack@google.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221213033030.83345-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-12-23 12:33:53 -05:00
..
boot - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
coco x86/tdx: Panic on bad configs that #VE on "private" memory access 2022-11-01 16:02:40 -07:00
configs x86/defconfig: Enable CONFIG_DEBUG_WX=y 2022-09-02 10:41:42 +02:00
crypto crypto: x86/polyval - Fix crashes when keys are not 16-byte aligned 2022-10-21 19:05:05 +08:00
entry treewide: use prandom_u32_max() when possible, part 1 2022-10-11 17:42:55 -06:00
events perf/x86/core: Zero @lbr instead of returning -1 in x86_perf_get_lbr() stub 2022-11-09 12:31:11 -05:00
hyperv x86/hyperv: Replace kmap() with kmap_local_page() 2022-10-03 08:49:48 +00:00
ia32
include Merge remote-tracking branch 'kvm/queue' into HEAD 2022-12-12 15:54:07 -05:00
kernel x86/kvm: Remove unused virt to phys translation in kvm_guest_cpu_init() 2022-11-09 12:31:15 -05:00
kvm KVM: x86/mmu: Re-check under lock that TDP MMU SP hugepage is disallowed 2022-12-23 12:33:53 -05:00
lib - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
math-emu
mm x86/mm: Do not verify W^X at boot up 2022-10-24 18:05:27 -07:00
net bpf: Fix dispatcher patchable function entry to 5 bytes nop 2022-10-20 18:57:51 -07:00
pci x86/PCI: Revert "x86/PCI: Clip only host bridge windows for E820 regions" 2022-06-17 14:24:14 -05:00
platform EFI updates for v6.1 2022-10-09 08:56:54 -07:00
power
purgatory x86/purgatory: disable KMSAN instrumentation 2022-10-28 13:37:23 -07:00
ras
realmode x86: kmsan: disable instrumentation of unsupported code 2022-10-03 14:03:24 -07:00
tools x86/tools/relocs: Ignore __kcfi_typeid_ relocations 2022-09-26 10:13:15 -07:00
um arch: um: Mark the stack non-executable to fix a binutils warning 2022-09-21 09:11:42 +02:00
video
virt/vmx/tdx
xen xen: branch for v6.1-rc4 2022-11-06 10:42:29 -08:00
.gitignore x86/purgatory: Omit use of bin2c 2022-07-25 10:32:32 +02:00
Kbuild
Kconfig x86/Kconfig: Drop check for -mabi=ms for CONFIG_EFI_STUB 2022-10-17 19:11:16 +02:00
Kconfig.assembler
Kconfig.cpu
Kconfig.debug arch: make TRACE_IRQFLAGS_NMI_SUPPORT generic 2022-06-23 15:39:21 +01:00
Makefile Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Makefile_32.cpu
Makefile.um