IF YOU WOULD LIKE TO GET AN ACCOUNT, please write an
email to Administrator. User accounts are meant only to access repo
and report issues and/or generate pull requests.
This is a purpose-specific Git hosting for
BaseALT
projects. Thank you for your understanding!
Только зарегистрированные пользователи имеют доступ к сервису!
Для получения аккаунта, обратитесь к администратору.
commit 6f0f2d5ef895d66a3f2b32dd05189ec34afa5a55 upstream.
By default, KVM/SVM will intercept attempts by the guest to transition
out of C0. However, the KVM_CAP_X86_DISABLE_EXITS capability can be used
by a VMM to change this behavior. To mitigate the cross-thread return
address predictions bug (X86_BUG_SMT_RSB), a VMM must not be allowed to
override the default behavior to intercept C0 transitions.
Use a module parameter to control the mitigation on processors that are
vulnerable to X86_BUG_SMT_RSB. If the processor is vulnerable to the
X86_BUG_SMT_RSB bug and the module parameter is set to mitigate the bug,
KVM will not allow the disabling of the HLT, MWAIT and CSTATE exits.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Message-Id: <4019348b5e07148eb4d593380a5f6713b93c9a16.1675956146.git.thomas.lendacky@amd.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit a44b331614e6f7e63902ed7dff7adc8c85edd8bc upstream.
When serializing and deserializing kvm_sregs, attributes of the segment
descriptors are stored by user space. For unusable segments,
vmx_segment_access_rights skips all attributes and sets them to 0.
This means we zero out the DPL (Descriptor Privilege Level) for unusable
entries.
Unusable segments are - contrary to their name - usable in 64bit mode and
are used by guests to for example create a linear map through the
NULL selector.
VMENTER checks if SS.DPL is correct depending on the CS segment type.
For types 9 (Execute Only) and 11 (Execute Read), CS.DPL must be equal to
SS.DPL [1].
We have seen real world guests setting CS to a usable segment with DPL=3
and SS to an unusable segment with DPL=3. Once we go through an sregs
get/set cycle, SS.DPL turns to 0. This causes the virtual machine to crash
reproducibly.
This commit changes the attribute logic to always preserve attributes for
unusable segments. According to [2] SS.DPL is always saved on VM exits,
regardless of the unusable bit so user space applications should have saved
the information on serialization correctly.
[3] specifies that besides SS.DPL the rest of the attributes of the
descriptors are undefined after VM entry if unusable bit is set. So, there
should be no harm in setting them all to the previous state.
[1] Intel SDM Vol 3C 26.3.1.2 Checks on Guest Segment Registers
[2] Intel SDM Vol 3C 27.3.2 Saving Segment Registers and Descriptor-Table
Registers
[3] Intel SDM Vol 3C 26.3.2.2 Loading Guest Segment Registers and
Descriptor-Table Registers
Cc: Alexander Graf <graf@amazon.de>
Cc: stable@vger.kernel.org
Signed-off-by: Hendrik Borghorst <hborghor@amazon.de>
Reviewed-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Alexander Graf <graf@amazon.com>
Message-Id: <20221114164823.69555-1-hborghor@amazon.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 45e966fcca03ecdcccac7cb236e16eea38cc18af upstream.
Passing the host topology to the guest is almost certainly wrong
and will confuse the scheduler. In addition, several fields of
these CPUID leaves vary on each processor; it is simply impossible to
return the right values from KVM_GET_SUPPORTED_CPUID in such a way that
they can be passed to KVM_SET_CPUID2.
The values that will most likely prevent confusion are all zeroes.
Userspace will have to override it anyway if it wishes to present a
specific topology to the guest.
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 31de69f4eea77b28a9724b3fa55aae104fc91fc7 upstream.
Set ENABLE_USR_WAIT_PAUSE in KVM's supported VMX MSR configuration if the
feature is supported in hardware and enabled in KVM's base, non-nested
configuration, i.e. expose ENABLE_USR_WAIT_PAUSE to L1 if it's supported.
This fixes a bug where saving/restoring, i.e. migrating, a vCPU will fail
if WAITPKG (the associated CPUID feature) is enabled for the vCPU, and
obviously allows L1 to enable the feature for L2.
KVM already effectively exposes ENABLE_USR_WAIT_PAUSE to L1 by stuffing
the allowed-1 control ina vCPU's virtual MSR_IA32_VMX_PROCBASED_CTLS2 when
updating secondary controls in response to KVM_SET_CPUID(2), but (a) that
depends on flawed code (KVM shouldn't touch VMX MSRs in response to CPUID
updates) and (b) runs afoul of vmx_restore_control_msr()'s restriction
that the guest value must be a strict subset of the supported host value.
Although no past commit explicitly enabled nested support for WAITPKG,
doing so is safe and functionally correct from an architectural
perspective as no additional KVM support is needed to virtualize TPAUSE,
UMONITOR, and UMWAIT for L2 relative to L1, and KVM already forwards
VM-Exits to L1 as necessary (commit bf653b78f960, "KVM: vmx: Introduce
handle_unexpected_vmexit and handle WAITPKG vmexit").
Note, KVM always keeps the hosts MSR_IA32_UMWAIT_CONTROL resident in
hardware, i.e. always runs both L1 and L2 with the host's power management
settings for TPAUSE and UMWAIT. See commit bf09fb6cba4f ("KVM: VMX: Stop
context switching MSR_IA32_UMWAIT_CONTROL") for more details.
Fixes: e69e72faa3a0 ("KVM: x86: Add support for user wait instructions")
Cc: stable@vger.kernel.org
Reported-by: Aaron Lewis <aaronlewis@google.com>
Reported-by: Yu Zhang <yu.c.zhang@linux.intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20221213062306.667649-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ef40757743b47cc95de9b4ed41525c94f8dc73d9 upstream.
When a VM reboots itself, the reset process will result in
an ioctl(KVM_SET_LAPIC, ...) to disable x2APIC mode and set
the xAPIC id of the vCPU to its default value, which is the
vCPU id.
That will be handled in KVM as follows:
kvm_vcpu_ioctl_set_lapic
kvm_apic_set_state
kvm_lapic_set_base => disable X2APIC mode
kvm_apic_state_fixup
kvm_lapic_xapic_id_updated
kvm_xapic_id(apic) != apic->vcpu->vcpu_id
kvm_set_apicv_inhibit(APICV_INHIBIT_REASON_APIC_ID_MODIFIED)
memcpy(vcpu->arch.apic->regs, s->regs, sizeof(*s)) => update APIC_ID
When kvm_apic_set_state invokes kvm_lapic_set_base to disable
x2APIC mode, the old 32-bit x2APIC id is still present rather
than the 8-bit xAPIC id. kvm_lapic_xapic_id_updated will set the
APICV_INHIBIT_REASON_APIC_ID_MODIFIED bit and disable APICv/x2AVIC.
Instead, kvm_lapic_xapic_id_updated must be called after APIC_ID is
changed.
In fact, this fixes another small issue in the code in that
potential changes to a vCPU's xAPIC ID need not be tracked for
KVM_GET_LAPIC.
Fixes: 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base")
Signed-off-by: Yuan ZhaoXiong <yuanzhaoxiong@baidu.com>
Message-Id: <1669984574-32692-1-git-send-email-yuanzhaoxiong@baidu.com>
Cc: stable@vger.kernel.org
Reported-by: Alejandro Jimenez <alejandro.j.jimenez@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9cc409325ddd776f6fd6293d5ce93ce1248af6e4 upstream.
Inject #GP for if VMXON is attempting with a CR0/CR4 that fails the
generic "is CRx valid" check, but passes the CR4.VMXE check, and do the
generic checks _after_ handling the post-VMXON VM-Fail.
The CR4.VMXE check, and all other #UD cases, are special pre-conditions
that are enforced prior to pivoting on the current VMX mode, i.e. occur
before interception if VMXON is attempted in VMX non-root mode.
All other CR0/CR4 checks generate #GP and effectively have lower priority
than the post-VMXON check.
Per the SDM:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD;
ELSIF not in VMX operation
THEN
IF (CPL > 0) or (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
ELSIF in VMX non-root operation
THEN VMexit;
ELSIF CPL > 0
THEN #GP(0);
ELSE VMfail("VMXON executed in VMX root operation");
FI;
which, if re-written without ELSIF, yields:
IF (register operand) or (CR0.PE = 0) or (CR4.VMXE = 0) or ...
THEN #UD
IF in VMX non-root operation
THEN VMexit;
IF CPL > 0
THEN #GP(0)
IF in VMX operation
THEN VMfail("VMXON executed in VMX root operation");
IF (in A20M mode) or
(the values of CR0 and CR4 are not supported in VMX operation)
THEN #GP(0);
Note, KVM unconditionally forwards VMXON VM-Exits that occur in L2 to L1,
i.e. there is no need to check the vCPU is not in VMX non-root mode. Add
a comment to explain why unconditionally forwarding such exits is
functionally correct.
Reported-by: Eric Li <ercli@ucdavis.edu>
Fixes: c7d855c2aff2 ("KVM: nVMX: Inject #UD if VMXON is attempted with incompatible CR0/CR4")
Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/r/20221006001956.329314-1-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eb3992e833d3a17f9b0a3e0371d0b1d3d566f740 upstream.
Resume the guest immediately when injecting a #GP on ECREATE due to an
invalid enclave size, i.e. don't attempt ECREATE in the host. The #GP is
a terminal fault, e.g. skipping the instruction if ECREATE is successful
would result in KVM injecting #GP on the instruction following ECREATE.
Fixes: 70210c044b4e ("KVM: VMX: Add SGX ENCLS[ECREATE] handler to enforce CPUID restrictions")
Cc: stable@vger.kernel.org
Cc: Kai Huang <kai.huang@intel.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20220930233132.1723330-1-seanjc@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
If a triple fault was fixed by kvm_x86_ops.nested_ops->triple_fault (by
turning it into a vmexit), there is no need to leave vcpu_enter_guest().
Any vcpu->requests will be caught later before the actual vmentry,
and in fact vcpu_enter_guest() was not initializing the "r" variable.
Depending on the compiler's whims, this could cause the
x86_64/triple_fault_event_test test to fail.
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Fixes: 92e7d5c83aff ("KVM: x86: allow L1 to not intercept triple fault")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fixes for Xen emulation. While nobody should be enabling it in
the kernel (the only public users of the feature are the selftests),
the bug effectively allows userspace to read arbitrary memory.
* Correctness fixes for nested hypervisors that do not intercept INIT
or SHUTDOWN on AMD; the subsequent CPU reset can cause a use-after-free
when it disables virtualization extensions. While downgrading the panic
to a WARN is quite easy, the full fix is a bit more laborious; there
are also tests. This is the bulk of the pull request.
* Fix race condition due to incorrect mmu_lock use around
make_mmu_pages_available().
Generic:
* Obey changes to the kvm.halt_poll_ns module parameter in VMs
not using KVM_CAP_HALT_POLL, restoring behavior from before
the introduction of the capability
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmODI84UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroPVJwgAombWOBf549JiHGPtwejuQO20nTSj
Om9pzWQ9dR182P+ju/FdqSPXt/Lc8i+z5zSXDrV3HQ6/a3zIItA+bOAUiMFvHNAQ
w/7pEb1MzVOsEg2SXGOjZvW3WouB4Z4R0PosInYjrFrRGRAaw5iaTOZHGezE44t2
WBWk1PpdMap7J/8sjNT1ble72ig9JdSW4qeJUQ1GWxHCigI5sESCQVqF446KM0jF
gTYPGX5TqpbWiIejF0yNew9yNKMi/yO4Pz8I5j3vtopeHx24DCIqUAGaEg6ykErX
vnzYbVP7NaFrqtje49PsK6i1cu2u7uFPArj0dxo3DviQVZVHV1q6tNmI4A==
=Qgei
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"x86:
- Fixes for Xen emulation. While nobody should be enabling it in the
kernel (the only public users of the feature are the selftests),
the bug effectively allows userspace to read arbitrary memory.
- Correctness fixes for nested hypervisors that do not intercept INIT
or SHUTDOWN on AMD; the subsequent CPU reset can cause a
use-after-free when it disables virtualization extensions. While
downgrading the panic to a WARN is quite easy, the full fix is a
bit more laborious; there are also tests. This is the bulk of the
pull request.
- Fix race condition due to incorrect mmu_lock use around
make_mmu_pages_available().
Generic:
- Obey changes to the kvm.halt_poll_ns module parameter in VMs not
using KVM_CAP_HALT_POLL, restoring behavior from before the
introduction of the capability"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: Update gfn_to_pfn_cache khva when it moves within the same page
KVM: x86/xen: Only do in-kernel acceleration of hypercalls for guest CPL0
KVM: x86/xen: Validate port number in SCHEDOP_poll
KVM: x86/mmu: Fix race condition in direct_page_fault
KVM: x86: remove exit_int_info warning in svm_handle_exit
KVM: selftests: add svm part to triple_fault_test
KVM: x86: allow L1 to not intercept triple fault
kvm: selftests: add svm nested shutdown test
KVM: selftests: move idt_entry to header
KVM: x86: forcibly leave nested mode on vCPU reset
KVM: x86: add kvm_leave_nested
KVM: x86: nSVM: harden svm_free_nested against freeing vmcb02 while still in use
KVM: x86: nSVM: leave nested mode on vCPU free
KVM: Obey kvm.halt_poll_ns in VMs not using KVM_CAP_HALT_POLL
KVM: Avoid re-reading kvm->max_halt_poll_ns during halt-polling
KVM: Cap vcpu->halt_poll_ns before halting rather than after
This brings in a few important fixes for Xen emulation.
While nobody should be enabling it, the bug effectively
allows userspace to read arbitrary memory.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are almost no hypercalls which are valid from CPL > 0, and definitely
none which are handled by the kernel.
Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Reported-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: stable@kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We shouldn't allow guests to poll on arbitrary port numbers off the end
of the event channel table.
Fixes: 1a65105a5aba ("KVM: x86/xen: handle PV spinlocks slowpath")
[dwmw2: my bug though; the original version did check the validity as a
side-effect of an idr_find() which I ripped out in refactoring.]
Reported-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: stable@kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
make_mmu_pages_available() must be called with mmu_lock held for write.
However, if the TDP MMU is used, it will be called with mmu_lock held for
read.
This function does nothing unless shadow pages are used, so there is no
race unless nested TDP is used.
Since nested TDP uses shadow pages, old shadow pages may be zapped by this
function even when the TDP MMU is enabled.
Since shadow pages are never allocated by kvm_tdp_mmu_map(), a race
condition can be avoided by not calling make_mmu_pages_available() if the
TDP MMU is currently in use.
I encountered this when repeatedly starting and stopping nested VM.
It can be artificially caused by allocating a large number of nested TDP
SPTEs.
For example, the following BUG and general protection fault are caused in
the host kernel.
pte_list_remove: 00000000cd54fc10 many->many
------------[ cut here ]------------
kernel BUG at arch/x86/kvm/mmu/mmu.c:963!
invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:pte_list_remove.cold+0x16/0x48 [kvm]
Call Trace:
<TASK>
drop_spte+0xe0/0x180 [kvm]
mmu_page_zap_pte+0x4f/0x140 [kvm]
__kvm_mmu_prepare_zap_page+0x62/0x3e0 [kvm]
kvm_mmu_zap_oldest_mmu_pages+0x7d/0xf0 [kvm]
direct_page_fault+0x3cb/0x9b0 [kvm]
kvm_tdp_page_fault+0x2c/0xa0 [kvm]
kvm_mmu_page_fault+0x207/0x930 [kvm]
npf_interception+0x47/0xb0 [kvm_amd]
svm_invoke_exit_handler+0x13c/0x1a0 [kvm_amd]
svm_handle_exit+0xfc/0x2c0 [kvm_amd]
kvm_arch_vcpu_ioctl_run+0xa79/0x1780 [kvm]
kvm_vcpu_ioctl+0x29b/0x6f0 [kvm]
__x64_sys_ioctl+0x95/0xd0
do_syscall_64+0x5c/0x90
general protection fault, probably for non-canonical address
0xdead000000000122: 0000 [#1] PREEMPT SMP NOPTI
RIP: 0010:kvm_mmu_commit_zap_page.part.0+0x4b/0xe0 [kvm]
Call Trace:
<TASK>
kvm_mmu_zap_oldest_mmu_pages+0xae/0xf0 [kvm]
direct_page_fault+0x3cb/0x9b0 [kvm]
kvm_tdp_page_fault+0x2c/0xa0 [kvm]
kvm_mmu_page_fault+0x207/0x930 [kvm]
npf_interception+0x47/0xb0 [kvm_amd]
CVE: CVE-2022-45869
Fixes: a2855afc7ee8 ("KVM: x86/mmu: Allow parallel page faults for the TDP MMU")
Signed-off-by: Kazuki Takiguchi <takiguchi.kazuki171@gmail.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This fixes three issues in nested SVM:
1) in the shutdown_interception() vmexit handler we call kvm_vcpu_reset().
However, if running nested and L1 doesn't intercept shutdown, the function
resets vcpu->arch.hflags without properly leaving the nested state.
This leaves the vCPU in inconsistent state and later triggers a kernel
panic in SVM code. The same bug can likely be triggered by sending INIT
via local apic to a vCPU which runs a nested guest.
On VMX we are lucky that the issue can't happen because VMX always
intercepts triple faults, thus triple fault in L2 will always be
redirected to L1. Plus, handle_triple_fault() doesn't reset the vCPU.
INIT IPI can't happen on VMX either because INIT events are masked while
in VMX mode.
Secondarily, KVM doesn't honour SHUTDOWN intercept bit of L1 on SVM.
A normal hypervisor should always intercept SHUTDOWN, a unit test on
the other hand might want to not do so.
Finally, the guest can trigger a kernel non rate limited printk on SVM
from the guest, which is fixed as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is valid to receive external interrupt and have broken IDT entry,
which will lead to #GP with exit_int_into that will contain the index of
the IDT entry (e.g any value).
Other exceptions can happen as well, like #NP or #SS
(if stack switch fails).
Thus this warning can be user triggred and has very little value.
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-10-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is SVM correctness fix - although a sane L1 would intercept
SHUTDOWN event, it doesn't have to, so we have to honour this.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-8-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While not obivous, kvm_vcpu_reset() leaves the nested mode by clearing
'vcpu->arch.hflags' but it does so without all the required housekeeping.
On SVM, it is possible to have a vCPU reset while in guest mode because
unlike VMX, on SVM, INIT's are not latched in SVM non root mode and in
addition to that L1 doesn't have to intercept triple fault, which should
also trigger L1's reset if happens in L2 while L1 didn't intercept it.
If one of the above conditions happen, KVM will continue to use vmcb02
while not having in the guest mode.
Later the IA32_EFER will be cleared which will lead to freeing of the
nested guest state which will (correctly) free the vmcb02, but since
KVM still uses it (incorrectly) this will lead to a use after free
and kernel crash.
This issue is assigned CVE-2022-3344
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-5-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
add kvm_leave_nested which wraps a call to nested_ops->leave_nested
into a function.
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-4-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make sure that KVM uses vmcb01 before freeing nested state, and warn if
that is not the case.
This is a minimal fix for CVE-2022-3344 making the kernel print a warning
instead of a kernel panic.
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-3-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the VM was terminated while nested, we free the nested state
while the vCPU still is in nested mode.
Soon a warning will be added for this condition.
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221103141351.50662-2-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DE_CFG contains the LFENCE serializing bit, restore it on resume too.
This is relevant to older families due to the way how they do S3.
Unify and correct naming while at it.
Fixes: e4d0e84e4907 ("x86/cpu/AMD: Make LFENCE a serializing instruction")
Reported-by: Andrew Cooper <Andrew.Cooper3@citrix.com>
Reported-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
When zapping a GFN range, pass 0 => ALL_ONES for the to-be-invalidated
range to effectively block all page faults while the zap is in-progress.
The invalidation helpers take a host virtual address, whereas zapping a
GFN obviously provides a guest physical address and with the wrong unit
of measurement (frame vs. byte).
Alternatively, KVM could walk all memslots to get the associated HVAs,
but thanks to SMM, that would require multiple lookups. And practically
speaking, kvm_zap_gfn_range() usage is quite rare and not a hot path,
e.g. MTRR and CR0.CD are almost guaranteed to be done only on vCPU0
during boot, and APICv inhibits are similarly infrequent operations.
Fixes: edb298c663fc ("KVM: x86/mmu: bump mmu notifier count in kvm_zap_gfn_range")
Reported-by: Chao Peng <chao.p.peng@linux.intel.com>
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221111001841.2412598-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AMD PerfMonV2 specification allows for a maximum of 16 GP counters,
but currently only 6 pairs of MSRs are accepted by KVM.
While AMD64_NUM_COUNTERS_CORE is already equal to 6, increasing without
adjusting msrs_to_save_all[] could result in out-of-bounds accesses.
Therefore introduce a macro (named KVM_AMD_PMC_MAX_GENERIC) to
refer to the number of counters supported by KVM.
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Intel Architectural IA32_PMCx MSRs addresses range allows for a
maximum of 8 GP counters, and KVM cannot address any more. Introduce a
local macro (named KVM_INTEL_PMC_MAX_GENERIC) and use it consistently to
refer to the number of counters supported by KVM, thus avoiding possible
out-of-bound accesses.
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-2-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The SDM lists an architectural MSR IA32_CORE_CAPABILITIES (0xCF)
that limits the theoretical maximum value of the Intel GP PMC MSRs
allocated at 0xC1 to 14; likewise the Intel April 2022 SDM adds
IA32_OVERCLOCKING_STATUS at 0x195 which limits the number of event
selection MSRs to 15 (0x186-0x194).
Limiting the maximum number of counters to 14 or 18 based on the currently
allocated MSRs is clearly fragile, and it seems likely that Intel will
even place PMCs 8-15 at a completely different range of MSR indices.
So stop at the maximum number of GP PMCs supported today on Intel
processors.
There are some machines, like Intel P4 with non Architectural PMU, that
may indeed have 18 counters, but those counters are in a completely
different MSR address range and are not supported by KVM.
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: stable@vger.kernel.org
Fixes: cf05a67b68b8 ("KVM: x86: omit "impossible" pmu MSRs from MSR list")
Suggested-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220919091008.60695-1-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Explicitly print the VMSA dump at KERN_DEBUG log level, KERN_CONT uses
KERNEL_DEFAULT if the previous log line has a newline, i.e. if there's
nothing to continuing, and as a result the VMSA gets dumped when it
shouldn't.
The KERN_CONT documentation says it defaults back to KERNL_DEFAULT if the
previous log line has a newline. So switch from KERN_CONT to
print_hex_dump_debug().
Jarkko pointed this out in reference to the original patch. See:
https://lore.kernel.org/all/YuPMeWX4uuR1Tz3M@kernel.org/
print_hex_dump(KERN_DEBUG, ...) was pointed out there, but
print_hex_dump_debug() should similar.
Fixes: 6fac42f127b8 ("KVM: SVM: Dump Virtual Machine Save Area (VMSA) to klog")
Signed-off-by: Peter Gonda <pgonda@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Cc: Jarkko Sakkinen <jarkko@kernel.org>
Cc: Harald Hoyer <harald@profian.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: x86@kernel.org
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: stable@vger.kernel.org
Message-Id: <20221104142220.469452-1-pgonda@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_virt_spec_ctrl only deals with the paravirtualized
MSR_IA32_VIRT_SPEC_CTRL now and does not handle MSR_IA32_SPEC_CTRL
anymore; remove the corresponding, unused argument.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.
With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.
To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Allow access to the percpu area via the GS segment base, which is
needed in order to access the saved host spec_ctrl value. In linux-next
FILL_RETURN_BUFFER also needs to access percpu data.
For simplicity, the physical address of the save area is added to struct
svm_cpu_data.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Analyzed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is error-prone that code after vmexit cannot access percpu data
because GSBASE has not been restored yet. It forces MSR_IA32_SPEC_CTRL
save/restore to happen very late, after the predictor untraining
sequence, and it gets in the way of return stack depth tracking
(a retbleed mitigation that is in linux-next as of 2022-11-09).
As a first step towards fixing that, move the VMCB VMSAVE/VMLOAD to
assembly, essentially undoing commit fb0c4a4fee5a ("KVM: SVM: move
VMLOAD/VMSAVE to C code", 2021-03-15). The reason for that commit was
that it made it simpler to use a different VMCB for VMLOAD/VMSAVE versus
VMRUN; but that is not a big hassle anymore thanks to the kvm-asm-offsets
machinery and other related cleanups.
The idea on how to number the exception tables is stolen from
a prototype patch by Peter Zijlstra.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Link: <https://lore.kernel.org/all/f571e404-e625-bae1-10e9-449b2eb4cbd8@citrix.com/>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The svm_data percpu variable is a pointer, but it is allocated via
svm_hardware_setup() when KVM is loaded. Unlike hardware_enable()
this means that it is never NULL for the whole lifetime of KVM, and
static allocation does not waste any memory compared to the status quo.
It is also more efficient and more easily handled from assembly code,
so do it and don't look back.
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The "cpu" field of struct svm_cpu_data has been write-only since commit
4b656b120249 ("KVM: SVM: force new asid on vcpu migration", 2009-08-05).
Remove it.
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The pointer to svm_cpu_data in struct vcpu_svm looks interesting from
the point of view of accessing it after vmexit, when the GSBASE is still
containing the guest value. However, despite existing since the very
first commit of drivers/kvm/svm.c (commit 6aa8b732ca01, "[PATCH] kvm:
userspace interface", 2006-12-10), it was never set to anything.
Ignore the opportunity to fix a 16 year old "bug" and delete it; doing
things the "harder" way makes it possible to remove more old cruft.
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Continue moving accesses to struct vcpu_svm to vmenter.S. Reducing the
number of arguments limits the chance of mistakes due to different
registers used for argument passing in 32- and 64-bit ABIs; pushing the
VMCB argument and almost immediately popping it into a different
register looks pretty weird.
32-bit ABI is not a concern for __svm_sev_es_vcpu_run() which is 64-bit
only; however, it will soon need @svm to save/restore SPEC_CTRL so stay
consistent with __svm_vcpu_run() and let them share the same prototype.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
32-bit ABI uses RAX/RCX/RDX as its argument registers, so they are in
the way of instructions that hardcode their operands such as RDMSR/WRMSR
or VMLOAD/VMRUN/VMSAVE.
In preparation for moving vmload/vmsave to __svm_vcpu_run(), keep
the pointer to the struct vcpu_svm in %rdi. In particular, it is now
possible to load svm->vmcb01.pa in %rax without clobbering the struct
vcpu_svm pointer.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since registers are reachable through vcpu_svm, and we will
need to access more fields of that struct, pass it instead
of the regs[] array.
No functional change intended.
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This already removes an ugly #include "" from asm-offsets.c, but
especially it avoids a future error when trying to define asm-offsets
for KVM's svm/svm.h header.
This would not work for kernel/asm-offsets.c, because svm/svm.h
includes kvm_cache_regs.h which is not in the include path when
compiling asm-offsets.c. The problem is not there if the .c file is
in arch/x86/kvm.
Suggested-by: Sean Christopherson <seanjc@google.com>
Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Swap the 1st and 2nd arguments to be consistent with the usage of
kvcalloc().
Fixes: c9b8fecddb5b ("KVM: use kvcalloc for array allocations")
Signed-off-by: Liao Chang <liaochang1@huawei.com>
Message-Id: <20221103011749.139262-1-liaochang1@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
kvm_zap_gfn_range() must be called in an SRCU read-critical section, but
there is no SRCU annotation in __kvm_set_or_clear_apicv_inhibit(). This
can lead to the following warning via
kvm_arch_vcpu_ioctl_set_guest_debug() if a Shadow MMU is in use (TDP
MMU disabled or nesting):
[ 1416.659809] =============================
[ 1416.659810] WARNING: suspicious RCU usage
[ 1416.659839] 6.1.0-dbg-DEV #1 Tainted: G S I
[ 1416.659853] -----------------------------
[ 1416.659854] include/linux/kvm_host.h:954 suspicious rcu_dereference_check() usage!
[ 1416.659856]
...
[ 1416.659904] dump_stack_lvl+0x84/0xaa
[ 1416.659910] dump_stack+0x10/0x15
[ 1416.659913] lockdep_rcu_suspicious+0x11e/0x130
[ 1416.659919] kvm_zap_gfn_range+0x226/0x5e0
[ 1416.659926] ? kvm_make_all_cpus_request_except+0x18b/0x1e0
[ 1416.659935] __kvm_set_or_clear_apicv_inhibit+0xcc/0x100
[ 1416.659940] kvm_arch_vcpu_ioctl_set_guest_debug+0x350/0x390
[ 1416.659946] kvm_vcpu_ioctl+0x2fc/0x620
[ 1416.659955] __se_sys_ioctl+0x77/0xc0
[ 1416.659962] __x64_sys_ioctl+0x1d/0x20
[ 1416.659965] do_syscall_64+0x3d/0x80
[ 1416.659969] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Always take the KVM SRCU read lock in __kvm_set_or_clear_apicv_inhibit()
to protect the GFN to memslot translation. The SRCU read lock is not
technically required when no Shadow MMUs are in use, since the TDP MMU
walks the paging structures from the roots and does not need to look up
GFN translations in the memslots, but make the SRCU locking
unconditional for simplicty.
In most cases, the SRCU locking is taken care of in the vCPU run loop,
but when called through other ioctls (such as KVM_SET_GUEST_DEBUG)
there is no srcu_read_lock.
Tested: ran tools/testing/selftests/kvm/x86_64/debug_regs on a DBG
build. This patch causes the suspicious RCU warning to disappear.
Note that the warning is hit in __kvm_zap_rmaps(), so
kvm_memslots_have_rmaps() must return true in order for this to
repro (i.e. the TDP MMU must be off or nesting in use.)
Reported-by: Greg Thelen <gthelen@google.com>
Fixes: 36222b117e36 ("KVM: x86: don't disable APICv memslot when inhibited")
Signed-off-by: Ben Gardon <bgardon@google.com>
Message-Id: <20221102205359.1260980-1-bgardon@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Ignore guest CPUID for host userspace writes to the DEBUGCTL MSR, KVM's
ABI is that setting CPUID vs. state can be done in any order, i.e. KVM
allows userspace to stuff MSRs prior to setting the guest's CPUID that
makes the new MSR "legal".
Keep the vmx_get_perf_capabilities() check for guest writes, even though
it's technically unnecessary since the vCPU's PERF_CAPABILITIES is
consulted when refreshing LBR support. A future patch will clean up
vmx_get_perf_capabilities() to avoid the RDMSR on every call, at which
point the paranoia will incur no meaningful overhead.
Note, prior to vmx_get_perf_capabilities() checking that the host fully
supports LBRs via x86_perf_get_lbr(), KVM effectively relied on
intel_pmu_lbr_is_enabled() to guard against host userspace enabling LBRs
on platforms without full support.
Fixes: c646236344e9 ("KVM: vmx/pmu: Add PMU_CAP_LBR_FMT check when guest LBR is enabled")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-5-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fold vmx_supported_debugctl() into vcpu_supported_debugctl(), its only
caller. Setting bits only to clear them a few instructions later is
rather silly, and splitting the logic makes things seem more complicated
than they actually are.
Opportunistically drop DEBUGCTLMSR_LBR_MASK now that there's a single
reference to the pair of bits. The extra layer of indirection provides
no meaningful value and makes it unnecessarily tedious to understand
what KVM is doing.
No functional change.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-4-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Advertise LBR support to userspace via MSR_IA32_PERF_CAPABILITIES if and
only if perf fully supports LBRs. Perf may disable LBRs (by zeroing the
number of LBRs) even on platforms the allegedly support LBRs, e.g. if
probing any LBR MSRs during setup fails.
Fixes: be635e34c284 ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES")
Reported-by: Like Xu <like.xu.linux@gmail.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221006000314.73240-3-seanjc@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Should not call eventfd_ctx_put() in case of error.
Fixes: 2fd6df2f2b47 ("KVM: x86/xen: intercept EVTCHNOP_send from guests")
Reported-by: syzbot+6f0c896c5a9449a10ded@syzkaller.appspotmail.com
Signed-off-by: Eiichi Tsukata <eiichi.tsukata@nutanix.com>
Message-Id: <20221028092631.117438-1-eiichi.tsukata@nutanix.com>
[Introduce new goto target instead. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On 64 bit host, if the guest doesn't have X86_FEATURE_LM, KVM will
access 16 gprs to 32-bit smram image, causing out-ouf-bound ram
access.
On 32 bit host, the rsm_load_state_64/enter_smm_save_state_64
is compiled out, thus access overflow can't happen.
Fixes: b443183a25ab61 ("KVM: x86: Reduce the number of emulator GPRs to '8' for 32-bit KVM")
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221025124741.228045-15-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update the emulation mode when handling writes to CR0, because
toggling CR0.PE switches between Real and Protected Mode, and toggling
CR0.PG when EFER.LME=1 switches between Long and Protected Mode.
This is likely a benign bug because there is no writeback of state,
other than the RIP increment, and when toggling CR0.PE, the CPU has
to execute code from a very low memory address.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-14-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Update the emulation mode after RSM so that RIP will be correctly
written back, because the RSM instruction can switch the CPU mode from
32 bit (or less) to 64 bit.
This fixes a guest crash in case the #SMI is received while the guest
runs a code from an address > 32 bit.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-13-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some instructions update the cpu execution mode, which needs to update the
emulation mode.
Extract this code, and make assign_eip_far use it.
assign_eip_far now reads CS, instead of getting it via a parameter,
which is ok, because callers always assign CS to the same value
before calling this function.
No functional change is intended.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-12-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SYSEXIT is one of the instructions that can change the
processor mode, thus ctxt->mode should be updated after it.
Note that this is likely a benign bug, because the only problematic
mode change is from 32 bit to 64 bit which can lead to truncation of RIP,
and it is not possible to do with sysexit,
since sysexit running in 32 bit mode will be limited to 32 bit version.
Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20221025124741.228045-11-mlevitsk@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the gfn_to_pfn_cache lock initialization to another helper and
call the new helper during VM/vCPU creation. There are race
conditions possible due to kvm_gfn_to_pfn_cache_init()'s
ability to re-initialize the cache's locks.
For example: a race between ioctl(KVM_XEN_HVM_EVTCHN_SEND) and
kvm_gfn_to_pfn_cache_init() leads to a corrupted shinfo gpc lock.
(thread 1) | (thread 2)
|
kvm_xen_set_evtchn_fast |
read_lock_irqsave(&gpc->lock, ...) |
| kvm_gfn_to_pfn_cache_init
| rwlock_init(&gpc->lock)
read_unlock_irqrestore(&gpc->lock, ...) |
Rename "cache_init" and "cache_destroy" to activate+deactivate to
avoid implying that the cache really is destroyed/freed.
Note, there more races in the newly named kvm_gpc_activate() that will
be addressed separately.
Fixes: 982ed0de4753 ("KVM: Reinstate gfn_to_pfn_cache with invalidation support")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
[sean: call out that this is a bug fix]
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20221013211234.1318131-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clear enable_sgx if ENCLS-exiting is not supported, i.e. if SGX cannot be
virtualized. When KVM is loaded, adjust_vmx_controls checks that the
bit is available before enabling the feature; however, other parts of the
code check enable_sgx and not clearing the variable caused two different
bugs, mostly affecting nested virtualization scenarios.
First, because enable_sgx remained true, SECONDARY_EXEC_ENCLS_EXITING
would be marked available in the capability MSR that are accessed by a
nested hypervisor. KVM would then propagate the control from vmcs12
to vmcs02 even if it isn't supported by the processor, thus causing an
unexpected VM-Fail (exit code 0x7) in L1.
Second, vmx_set_cpu_caps() would not clear the SGX bits when hardware
support is unavailable. This is a much less problematic bug as it only
happens if SGX is soft-disabled (available in the processor but hidden
in CPUID) or if SGX is supported for bare metal but not in the VMCS
(will never happen when running on bare metal, but can theoertically
happen when running in a VM).
Last but not least, this ensures that module params in sysfs reflect
KVM's actual configuration.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=2127128
Fixes: 72add915fbd5 ("KVM: VMX: Enable SGX virtualization for SGX1, SGX2 and LC")
Cc: stable@vger.kernel.org
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Bandan Das <bsd@redhat.com>
Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20221025123749.2201649-1-eesposit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>