linux

iv/linux

Author	SHA1	Message	Date
Jim Mattson	aef46a6476	KVM: x86: VMX: Replace some Intel model numbers with mnemonics Intel processor code names are more familiar to many readers than their decimal model numbers. Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-29-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:57 -04:00
Sean Christopherson	64f80ea73b	KVM: VMX: Adjust CR3/INVPLG interception for EPT=y at runtime, not setup Clear the CR3 and INVLPG interception controls at runtime based on whether or not EPT is being _used_, as opposed to clearing the bits at setup if EPT is _supported_ in hardware, and then restoring them when EPT is not used. Not mucking with the base config will allow using the base config as the starting point for emulating the VMX capability MSRs. Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-28-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:57 -04:00
Vitaly Kuznetsov	a83bea73fa	KVM: VMX: Add missing CPU based VM execution controls to vmcs_config As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the CPU based VM execution controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_CPU_BASED_VM_EXEC_CONTROL and filter them out in vmx_exec_control(). No functional change intended. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-27-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:56 -04:00
Vitaly Kuznetsov	f16e47429e	KVM: VMX: Add missing VMEXIT controls to vmcs_config As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, add the VMEXIT controls which KVM doesn't use but supports for nVMX to KVM_OPT_VMX_VM_EXIT_CONTROLS and filter them out in vmx_vmexit_ctrl(). No functional change intended. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-26-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:55 -04:00
Vitaly Kuznetsov	e89e1e2302	KVM: VMX: Move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering out of setup_vmcs_config() As a preparation to reusing the result of setup_vmcs_config() in nested VMX MSR setup, move CPU_BASED_CR8_{LOAD,STORE}_EXITING filtering to vmx_exec_control(). No functional change intended. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-25-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:55 -04:00
Vitaly Kuznetsov	ee087b4da0	KVM: VMX: Extend VMX controls macro shenanigans When VMX controls macros are used to set or clear a control bit, make sure that this bit was checked in setup_vmcs_config() and thus is properly reflected in vmcs_config. Opportunistically drop pointless "< 0" check for adjust_vmx_controls()'s return value. No functional change intended. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-24-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:54 -04:00
Sean Christopherson	ebb3c8d409	KVM: VMX: Don't toggle VM_ENTRY_IA32E_MODE for 32-bit kernels/KVM Don't toggle VM_ENTRY_IA32E_MODE in 32-bit kernels/KVM and instead bug the VM if KVM attempts to run the guest with EFER.LMA=1. KVM doesn't support running 64-bit guests with 32-bit hosts. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-23-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:53 -04:00
Vitaly Kuznetsov	1dae276569	KVM: VMX: Tweak the special handling of SECONDARY_EXEC_ENCLS_EXITING in setup_vmcs_config() SECONDARY_EXEC_ENCLS_EXITING is the only control which is conditionally added to the 'optional' checklist in setup_vmcs_config() but the special case can be avoided by always checking for its presence first and filtering out the result later. Note: the situation when SECONDARY_EXEC_ENCLS_EXITING is present but cpu_has_sgx() is false is possible when SGX is "soft-disabled", e.g. if software writes MCE control MSRs or there's an uncorrectable #MC. Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-22-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:52 -04:00
Vitaly Kuznetsov	378c4c1850	KVM: VMX: Check CPU_BASED_{INTR,NMI}_WINDOW_EXITING in setup_vmcs_config() CPU_BASED_{INTR,NMI}_WINDOW_EXITING controls are toggled dynamically by vmx_enable_{irq,nmi}_window, handle_interrupt_window(), handle_nmi_window() but setup_vmcs_config() doesn't check their existence. Add the check and filter the controls out in vmx_exec_control(). Note: KVM explicitly supports CPUs without VIRTUAL_NMIS and all these CPUs are supposedly lacking NMI_WINDOW_EXITING too. Adjust cpu_has_virtual_nmis() accordingly. No functional change intended. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-21-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:51 -04:00
Vitaly Kuznetsov	ffaaf5913f	KVM: VMX: Check VM_ENTRY_IA32E_MODE in setup_vmcs_config() VM_ENTRY_IA32E_MODE control is toggled dynamically by vmx_set_efer() and setup_vmcs_config() doesn't check its existence. On the contrary, nested_vmx_setup_ctls_msrs() doesn set it on x86_64. Add the missing check and filter the bit out in vmx_vmentry_ctrl(). No (real) functional change intended as all existing CPUs supporting long mode and VMX are supposed to have it. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Jim Mattson <jmattson@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-20-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:51 -04:00
Sean Christopherson	f4c93d1a0e	KVM: nVMX: Always emulate PERF_GLOBAL_CTRL VM-Entry/VM-Exit controls Advertise VM_{ENTRY,EXIT}_LOAD_IA32_PERF_GLOBAL_CTRL as being supported for nested VMs irrespective of hardware support. KVM fully emulates the controls, i.e. manually emulates MSR writes on entry/exit, and never propagates the guest settings directly to vmcs02. In addition to allowing L1 VMMs to use the controls on older hardware, unconditionally advertising the controls will also allow KVM to use its vmcs01 configuration as the basis for the nested VMX configuration without causing a regression (due the errata which causes KVM to "hide" the control from vmcs01 but not vmcs12). Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-19-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:50 -04:00
Sean Christopherson	def9d705c0	KVM: nVMX: Don't propagate vmcs12's PERF_GLOBAL_CTRL settings to vmcs02 Don't propagate vmcs12's VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL to vmcs02. KVM doesn't disallow L1 from using VM_ENTRY_LOAD_IA32_PERF_GLOBAL_CTRL even when KVM itself doesn't use the control, e.g. due to the various CPU errata that where the MSR can be corrupted on VM-Exit. Preserve KVM's (vmcs01) setting to hopefully avoid having to toggle the bit in vmcs02 at a later point. E.g. if KVM is loading PERF_GLOBAL_CTRL when running L1, then odds are good KVM will also load the MSR when running L2. Fixes: 8bf00a529967 ("KVM: VMX: add support for switching of PERF_GLOBAL_CTRL") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20220830133737.1539624-18-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:49 -04:00
Vitaly Kuznetsov	9bcb90650e	KVM: VMX: Get rid of eVMCS specific VMX controls sanitization With the updated eVMCSv1 definition, there's no known 'problematic' controls which are exposed in VMX control MSRs but are not present in eVMCSv1: all known Hyper-V versions either don't expose the new fields by not setting bits in the VMX feature controls or support the new eVMCS revision. Get rid of VMX control MSRs filtering for KVM on Hyper-V. Note: VMX control MSRs filtering for Hyper-V on KVM (nested_evmcs_filter_control_msr()) stays as even the updated eVMCSv1 definition doesn't have all the features implemented by KVM and some fields are still missing. Moreover, nested_evmcs_filter_control_msr() has to support the original eVMCSv1 version when VMM wishes so. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-17-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:48 -04:00
Vitaly Kuznetsov	4da77090b0	KVM: nVMX: Support PERF_GLOBAL_CTRL with enlightened VMCS Enlightened VMCS v1 got updated and now includes the required fields for loading PERF_GLOBAL_CTRL upon VMENTER/VMEXIT features. For KVM on Hyper-V enablement, KVM can just observe VMX control MSRs and use the features (with or without eVMCS) when possible. Hyper-V on KVM is messier as Windows 11 guests fail to boot if the controls are advertised and a new PV feature flag, CPUID.0x4000000A.EBX BIT(0), is not set. Honor the Hyper-V CPUID feature flag to play nice with Windows guests. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20220830133737.1539624-16-vkuznets@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:47 -04:00
Sean Christopherson	3ff8a13d41	KVM: nVMX: WARN once and fail VM-Enter if eVMCS sees VMFUNC[63:32] != 0 WARN and reject nested VM-Enter if KVM is using eVMCS and manages to allow a non-zero value in the upper 32 bits of VM-function controls. The eVMCS code assumes all inputs are 32-bit values and subtly drops the upper bits. WARN instead of adding proper "support", it's unlikely the upper bits will be defined/used in the next decade. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-15-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:47 -04:00
Vitaly Kuznetsov	dea6e140d9	KVM: x86: hyper-v: Cache HYPERV_CPUID_NESTED_FEATURES CPUID leaf KVM has to check guest visible HYPERV_CPUID_NESTED_FEATURES.EBX CPUID leaf to know which Enlightened VMCS definition to use (original or 2022 update). Cache the leaf along with other Hyper-V CPUID feature leaves to make the check quick. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-12-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:44 -04:00
Vitaly Kuznetsov	c9d31986e8	KVM: nVMX: Support several new fields in eVMCSv1 Enlightened VMCS v1 definition was updated with new fields, add support for them for Hyper-V on KVM. Note: SSP, CET and Guest LBR features are not supported by KVM yet and 'struct vmcs12' has no corresponding fields. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-11-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:43 -04:00
Vitaly Kuznetsov	b19e4ff5e5	KVM: VMX: Define VMCS-to-EVMCS conversion for the new fields Enlightened VMCS v1 definition was updated with new fields, support them in KVM by defining VMCS-to-EVMCS conversion. Note: SSP, CET and Guest LBR features are not supported by KVM yet and the corresponding fields are not defined in 'enum vmcs_field', leave them commented out for now. Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-10-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:42 -04:00
Sean Christopherson	6cce93de28	KVM: nVMX: Use CC() macro to handle eVMCS unsupported controls checks Locally #define and use the nested virtualization Consistency Check (CC) macro to handle eVMCS unsupported controls checks. Using the macro loses the existing printing of the unsupported controls, but that's a feature and not a bug. The existing approach is flawed because the @err param to trace_kvm_nested_vmenter_failed() is the error code, not the error value. The eVMCS trickery mostly works as __print_symbolic() falls back to printing the raw hex value, but that subtly relies on not having a match between the unsupported value and VMX_VMENTER_INSTRUCTION_ERRORS. If it's really truly necessary to snapshot the bad value, then the tracepoint can be extended in the future. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-9-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:42 -04:00
Vitaly Kuznetsov	f4d361b4c2	KVM: nVMX: Refactor unsupported eVMCS controls logic to use 2-d array Refactor the handling of unsupported eVMCS to use a 2-d array to store the set of unsupported controls. KVM's handling of eVMCS is completely broken as there is no way for userspace to query which features are unsupported, nor does KVM prevent userspace from attempting to enable unsupported features. A future commit will remedy that by filtering and enforcing unsupported features when eVMCS, but that needs to be opt-in from userspace to avoid breakage, i.e. KVM needs to maintain its legacy behavior by snapshotting the exact set of controls that are currently (un)supported by eVMCS. No functional change intended. Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> [sean: split to standalone patch, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-8-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:41 -04:00
Sean Christopherson	85ab071af8	KVM: nVMX: Treat eVMCS as enabled for guest iff Hyper-V is also enabled When querying whether or not eVMCS is enabled on behalf of the guest, treat eVMCS as enable if and only if Hyper-V is enabled/exposed to the guest. Note, flows that come from the host, e.g. KVM_SET_NESTED_STATE, must NOT check for Hyper-V being enabled as KVM doesn't require guest CPUID to be set before most ioctls(). Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-7-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:40 -04:00
Sean Christopherson	3be29eb7b5	KVM: x86: Report error when setting CPUID if Hyper-V allocation fails Return -ENOMEM back to userspace if allocating the Hyper-V vCPU struct fails when enabling Hyper-V in guest CPUID. Silently ignoring failure means that KVM will not have an up-to-date CPUID cache if allocating the struct succeeds later on, e.g. when activating SynIC. Rejecting the CPUID operation also guarantess that vcpu->arch.hyperv is non-NULL if hyperv_enabled is true, which will allow for additional cleanup, e.g. in the eVMCS code. Note, the initialization needs to be done before CPUID is set, and more subtly before kvm_check_cpuid(), which potentially enables dynamic XFEATURES. Sadly, there's no easy way to avoid exposing Hyper-V details to CPUID or vice versa. Expose kvm_hv_vcpu_init() and the Hyper-V CPUID signature to CPUID instead of exposing cpuid_entry2_find() outside of CPUID code. It's hard to envision kvm_hv_vcpu_init() being misused, whereas cpuid_entry2_find() absolutely shouldn't be used outside of core CPUID code. Fixes: 10d7bf1e46dc ("KVM: x86: hyper-v: Cache guest CPUID leaves determining features availability") Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-6-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:39 -04:00
Sean Christopherson	1cac8d9f6b	KVM: x86: Check for existing Hyper-V vCPU in kvm_hv_vcpu_init() When potentially allocating/initializing the Hyper-V vCPU struct, check for an existing instance in kvm_hv_vcpu_init() instead of requiring callers to perform the check. Relying on callers to do the check is risky as it's all too easy for KVM to overwrite vcpu->arch.hyperv and leak memory, and it adds additional burden on callers without much benefit. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220830133737.1539624-5-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:38 -04:00
Vitaly Kuznetsov	ce2196b831	KVM: x86: Zero out entire Hyper-V CPUID cache before processing entries Wipe the whole 'hv_vcpu->cpuid_cache' with memset() instead of having to zero each particular member when the corresponding CPUID entry was not found. No functional change intended. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> [sean: split to separate patch] Signed-off-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220830133737.1539624-4-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:38 -04:00
Vitaly Kuznetsov	5ef384a60f	x86/hyperv: Update 'struct hv_enlightened_vmcs' definition Updated Hyper-V Enlightened VMCS specification lists several new fields for the following features: - PerfGlobalCtrl - EnclsExitingBitmap - Tsc Scaling - GuestLbrCtl - CET - SSP Update the definition. Note, the updated spec also provides an additional CPUID feature flag, CPUIDD.0x4000000A.EBX BIT(0), for PerfGlobalCtrl to workaround a Windows 11 quirk. Despite what the TLFS says: Indicates support for the GuestPerfGlobalCtrl and HostPerfGlobalCtrl fields in the enlightened VMCS. guests can safely use the fields if they are enumerated in the architectural VMX MSRs. I.e. KVM-on-HyperV doesn't need to check the CPUID bit, but KVM-as-HyperV must ensure the bit is set if PerfGlobalCtrl fields are exposed to L1. https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/tlfs/tlfs Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> [sean: tweak CPUID name to make it PerfGlobalCtrl only] Signed-off-by: Sean Christopherson <seanjc@google.com> Acked-by: Wei Liu <wei.liu@kernel.org> Link: https://lore.kernel.org/r/20220830133737.1539624-3-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:37 -04:00
Vitaly Kuznetsov	ea9da788a6	x86/hyperv: Fix 'struct hv_enlightened_vmcs' definition Section 1.9 of TLFS v6.0b says: "All structures are padded in such a way that fields are aligned naturally (that is, an 8-byte field is aligned to an offset of 8 bytes and so on)". 'struct enlightened_vmcs' has a glitch: ... struct { u32 nested_flush_hypercall:1; /* 836: 0 4 / u32 msr_bitmap:1; / 836: 1 4 / u32 reserved:30; / 836: 2 4 / } hv_enlightenments_control; / 836 4 / u32 hv_vp_id; / 840 4 / u64 hv_vm_id; / 844 8 / u64 partition_assist_page; / 852 8 */ ... And the observed values in 'partition_assist_page' make no sense at all. Fix the layout by padding the structure properly. Fixes: 68d1eb72ee99 ("x86/hyper-v: define struct hv_enlightened_vmcs and clean field bits") Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Michael Kelley <mikelley@microsoft.com> Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20220830133737.1539624-2-vkuznets@redhat.com Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:36 -04:00
Uros Bizjak	57abfa11ba	KVM: VMX: Do not declare vmread_error() asmlinkage There is no need to declare vmread_error() asmlinkage, its arguments can be passed via registers for both 32-bit and 64-bit targets. Function argument registers are considered call-clobbered registers, they are saved in the trampoline just before the function call and restored afterwards. Dropping "asmlinkage" patch unifies trampoline function argument handling between 32-bit and 64-bit targets and improves generated code for 32-bit targets. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20220817144045.3206-1-ubizjak@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:35 -04:00
Liam Ni	e390f4d69d	KVM:x86: Clean up ModR/M "reg" initialization in reg op decoding Refactor decode_register_operand() to get the ModR/M register if and only if the instruction uses a ModR/M encoding to make it more obvious how the register operand is retrieved. Signed-off-by: Liam Ni <zhiguangni01@gmail.com> Link: https://lore.kernel.org/r/20220908141210.1375828-1-zhiguangni01@zhaoxin.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:34 -04:00
Mingwei Zhang	02dfc44f20	KVM: x86: Print guest pgd in kvm_nested_vmenter() Print guest pgd in kvm_nested_vmenter() to enrich the information for tracing. When tdp is enabled, print the value of tdp page table (EPT/NPT); when tdp is disabled, print the value of non-nested CR3. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20220825225755.907001-4-mizhang@google.com [sean: print nested_cr3 vs. nested_eptp vs. guest_cr3] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:33 -04:00
David Matlack	37ef0be269	KVM: nVMX: Add tracepoint for nested VM-Enter Call trace_kvm_nested_vmenter() during nested VMLAUNCH/VMRESUME to bring parity with nSVM's usage of the tracepoint during nested VMRUN. Attempt to use analagous VMCS fields to the VMCB fields that are reported in the SVM case: "int_ctl": 32-bit field of the VMCB that the CPU uses to deliver virtual interrupts. The analagous VMCS field is the 16-bit "guest interrupt status". "event_inj": 32-bit field of VMCB that is used to inject events (exceptions and interrupts) into the guest. The analagous VMCS field is the "VM-entry interruption-information field". "npt_enabled": 1 when the VCPU has enabled nested paging. The analagous VMCS field is the enable-EPT execution control. "npt_addr": 64-bit field when the VCPU has enabled nested paging. The analagous VMCS field is the ept_pointer. Signed-off-by: David Matlack <dmatlack@google.com> [move the code into the nested_vmx_enter_non_root_mode().] Signed-off-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20220825225755.907001-3-mizhang@google.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:32 -04:00
Mingwei Zhang	89e54ec592	KVM: x86: Update trace function for nested VM entry to support VMX Update trace function for nested VM entry to support VMX. Existing trace function only supports nested VMX and the information printed out is AMD specific. So, rename trace_kvm_nested_vmrun() to trace_kvm_nested_vmenter(), since 'vmenter' is generic. Add a new field 'isa' to recognize Intel and AMD; Update the output to print out VMX/SVM related naming respectively, eg., vmcb vs. vmcs; npt vs. ept. Opportunistically update the call site of trace_kvm_nested_vmenter() to make one line per parameter. Signed-off-by: Mingwei Zhang <mizhang@google.com> Link: https://lore.kernel.org/r/20220825225755.907001-2-mizhang@google.com [sean: align indentation, s/update/rename in changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:32 -04:00
Sean Christopherson	bff0adc40c	KVM: x86: Use u64 for address and error code in page fault tracepoint Track the address and error code as 64-bit values in the page fault tracepoint. When TDP is enabled, the address is a GPA and thus can be a 64-bit value even on 32-bit hosts. And SVM's #NPF genereates 64-bit error codes. Opportunistically clean up the formatting. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:31 -04:00
Wonhyuk Yang	faa03b3972	KVM: Add extra information in kvm_page_fault trace point Currently, kvm_page_fault trace point provide fault_address and error code. However it is not enough to find which cpu and instruction cause kvm_page_faults. So add vcpu id and instruction pointer in kvm_page_fault trace point. Cc: Baik Song An <bsahn@etri.re.kr> Cc: Hong Yeon Kim <kimhy@etri.re.kr> Cc: Taeung Song <taeung@reallinux.co.kr> Cc: linuxgeek@linuxgeek.io Signed-off-by: Wonhyuk Yang <vvghjk1234@gmail.com> Link: https://lore.kernel.org/r/20220510071001.87169-1-vvghjk1234@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:30 -04:00
Paolo Bonzini	db25eb87ad	KVM: SVM: remove unnecessary check on INIT intercept Since svm_check_nested_events() is now handling INIT signals, there is no need to latch it until the VMEXIT is injected. The only condition under which INIT signals are latched is GIF=0. Suggested-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Link: https://lore.kernel.org/r/20220819165643.83692-1-pbonzini@redhat.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:28 -04:00
Uros Bizjak	afe30b59d3	KVM/VMX: Avoid stack engine synchronization uop in __vmx_vcpu_run Avoid instructions with explicit uses of the stack pointer between instructions that implicitly refer to it. The sequence of POP %reg; ADD $x, %RSP; POP %reg forces emission of synchronization uop to synchronize the value of the stack pointer in the stack engine and the out-of-order core. Using POP with the dummy register instead of ADD $x, %RSP results in a smaller code size and faster code. The patch also fixes the reference to the wrong register in the nearby comment. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20220816211010.25693-1-ubizjak@gmail.com Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-26 12:02:27 -04:00
Luciano Leão	30ea703a38	x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype Include the header containing the prototype of init_ia32_feat_ctl(), solving the following warning: $ make W=1 arch/x86/kernel/cpu/feat_ctl.o arch/x86/kernel/cpu/feat_ctl.c:112:6: warning: no previous prototype for ‘init_ia32_feat_ctl’ [-Wmissing-prototypes] 112 \| void init_ia32_feat_ctl(struct cpuinfo_x86 *c) This warning appeared after commit 5d5103595e9e5 ("x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup") had moved the function init_ia32_feat_ctl()'s prototype from arch/x86/kernel/cpu/cpu.h to arch/x86/include/asm/cpu.h. Note that, before the commit mentioned above, the header include "cpu.h" (arch/x86/kernel/cpu/cpu.h) was added by commit 0e79ad863df43 ("x86/cpu: Fix a -Wmissing-prototypes warning for init_ia32_feat_ctl()") solely to fix init_ia32_feat_ctl()'s missing prototype. So, the header include "cpu.h" is no longer necessary. [ bp: Massage commit message. ] Fixes: 5d5103595e9e5 ("x86/cpu: Reinitialize IA32_FEAT_CTL MSR on BSP during wakeup") Signed-off-by: Luciano Leão <lucianorsleao@gmail.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Nícolas F. R. A. Prado <n@nfraprado.net> Link: https://lore.kernel.org/r/20220922200053.1357470-1-lucianorsleao@gmail.com	2022-09-26 17:06:27 +02:00
Taehee Yoo	ba3579e6e4	crypto: aria-avx - add AES-NI/AVX/x86_64/GFNI assembler implementation of aria cipher The implementation is based on the 32-bit implementation of the aria. Also, aria-avx process steps are the similar to the camellia-avx. 1. Byteslice(16way) 2. Add-round-key. 3. Sbox 4. Diffusion layer. Except for s-box, all steps are the same as the aria-generic implementation. s-box step is very similar to camellia and sm4 implementation. There are 2 implementations for s-box step. One is to use AES-NI and affine transformation, which is the same as Camellia, sm4, and others. Another is to use GFNI. GFNI implementation is faster than AES-NI implementation. So, it uses GFNI implementation if the running CPU supports GFNI. There are 4 s-boxes in the ARIA and the 2 s-boxes are the same as AES's s-boxes. To calculate the first sbox, it just uses the aesenclast and then inverts shift_row. No more process is needed for this job because the first s-box is the same as the AES encryption s-box. To calculate the second sbox(invert of s1), it just uses the aesdeclast and then inverts shift_row. No more process is needed for this job because the second s-box is the same as the AES decryption s-box. To calculate the third s-box, it uses the aesenclast, then affine transformation, which is combined AES inverse affine and ARIA S2. To calculate the last s-box, it uses the aesdeclast, then affine transformation, which is combined X2 and AES forward affine. The optimized third and last s-box logic and GFNI s-box logic are implemented by Jussi Kivilinna. The aria-generic implementation is based on a 32-bit implementation, not an 8-bit implementation. the aria-avx Diffusion Layer implementation is based on aria-generic implementation because 8-bit implementation is not fit for parallel implementation but 32-bit is enough to fit for this. Signed-off-by: Taehee Yoo <ap420073@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2022-09-24 16:14:44 +08:00
Linus Torvalds	317fab7ec5	ARM: * Fix for kmemleak with pKVM s390: * Fixes for VFIO with zPCI * smatch fix x86: * Ensure XSAVE-capable hosts always allow FP and SSE state to be saved and restored via KVM_{GET,SET}_XSAVE * Fix broken max_mmu_rmap_size stat * Fix compile error with old glibc that doesn't have gettid() -----BEGIN PGP SIGNATURE----- iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmMtvg4UHHBib256aW5p QHJlZGhhdC5jb20ACgkQv/vSX3jHroNy2Af/VybWK2uAbaMkV6irQ1YTJIrRPco1 C+JQdiQklYbjzThfPWNF/MiH+VTObloR1KqztOeQbfcrgwzygO68D3bs0wkAukLA mtdcMjdsqNx8r9u533i6S8Dpo0RkHKl+I8+3mHdPHTzlrbCuYJFFFxFNLhE+xbrK DP2Gl/xXIGYwOv2nfHA/xxI7TRICv4IxmzQazxlmC27n6BLNSr8qp6jI9lXJQfJ8 XJh3SbmRux3/cs2oEqONg8DySJh631kI1jGGOmL3qk07ZR7A5KZ+lju0xM7vQIEq aR25YNYZux+BPIY/WxT1R0j6pwinBmFp8OoQYCs8DaQ65fdE5gegtJRpow== =VADm -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "As everyone back came back from conferences, here are the pending patches for Linux 6.0. ARM: - Fix for kmemleak with pKVM s390: - Fixes for VFIO with zPCI - smatch fix x86: - Ensure XSAVE-capable hosts always allow FP and SSE state to be saved and restored via KVM_{GET,SET}_XSAVE - Fix broken max_mmu_rmap_size stat - Fix compile error with old glibc that doesn't have gettid()" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabled KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES KVM: x86: Reinstate kvm_vcpu_arch.guest_supported_xcr0 KVM: x86/mmu: add missing update to max_mmu_rmap_size selftests: kvm: Fix a compile error in selftests/kvm/rseq_test.c KVM: s390: pci: register pci hooks without interpretation KVM: s390: pci: fix GAIT physical vs virtual pointers usage KVM: s390: Pass initialized arg even if unused KVM: s390: pci: fix plain integer as NULL pointer warnings KVM: arm64: Use kmemleak_free_part_phys() to unregister hyp_mem_base	2022-09-23 08:42:30 -07:00
Paolo Bonzini	69604fe76e	More pci fixes Fix for a code analyser warning -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEwGNS88vfc9+v45Yq41TmuOI4ufgFAmMtmHcACgkQ41TmuOI4 ufhapA/+OGjJtUKEa/LkCArUMUgJ+ZYy9cwx3Qrkvm7BtEguN35uxO5T7X8pdMDU /dYSDwtPT1ob/QmjhwwE0dpgUF8yfkBNLKBK37f6jTznI4mS6UEq/IB1BcCk3ss5 45xG8rWiXS7oFm9bxtNsa5jkrSf7gIOuEa570tOVtOlBYvtFIH2SiHvcCUY1w48N A6docb+vOJnPIBHPR/1q+bBoghO5rplYaX0EiAt3EYThrhAjsXjIFzaQPwXUxgFh Nk/3ni1St72sbWDiC7YxjDghMVuW1GuQAlgkDWvy8bdA0IfsaqbyISuyIWZWkRfx V8+7OfGVsCB7qhT4C+65fQPqwZwERNwwvAFsx175/Mm9/AXn217+sN52O54olFcM dhKI7med1R/LqO2HH9kCGIUfJ3t+0JHpFDwcJyiAZsFbYffnwUQbIW4YcA3O+rje cV8/kRhD6ebzrScTOMVH+Rb4Q0N0kL34iHXbI8wcTBKmSFrdRm9YeLtQM2P/HtPA GVyg49bfXfA13goM9jaUW/IgT1aSHNRHfbfFKTUB7ph2Wi5ktMizpkn/CQ8L7bm1 I0JoZfA7ALDufg2Ajyqsqmv3RX2k6ydenmX/AxzQlo34kSXY6X+N+yiPlqhumBqq Gs2pmvQoyvAmuShhj+eSZsQYBObgjzN7B/SAbbUBT71GerGyWeQ= =Z7gl -----END PGP SIGNATURE----- Merge tag 'kvm-s390-master-6.0-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD More pci fixes Fix for a code analyser warning	2022-09-23 10:06:08 -04:00
James Morse	f7b1843eca	x86/resctrl: Make resctrl_arch_rmid_read() return values in bytes resctrl_arch_rmid_read() returns a value in chunks, as read from the hardware. This needs scaling to bytes by mon_scale, as provided by the architecture code. Now that resctrl_arch_rmid_read() performs the overflow and corrections itself, it may as well return a value in bytes directly. This allows the accesses to the architecture specific 'hw' structure to be removed. Move the mon_scale conversion into resctrl_arch_rmid_read(). mbm_bw_count() is updated to calculate bandwidth from bytes. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-22-james.morse@arm.com	2022-09-23 14:25:05 +02:00
James Morse	d80975e264	x86/resctrl: Add resctrl_rmid_realloc_limit to abstract x86's boot_cpu_data resctrl_rmid_realloc_threshold can be set by user-space. The maximum value is specified by the architecture. Currently max_threshold_occ_write() reads the maximum value from boot_cpu_data.x86_cache_size, which is not portable to another architecture. Add resctrl_rmid_realloc_limit to describe the maximum size in bytes that user-space can set the threshold to. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-21-james.morse@arm.com	2022-09-23 14:24:16 +02:00
James Morse	ae2328b529	x86/resctrl: Rename and change the units of resctrl_cqm_threshold resctrl_cqm_threshold is stored in a hardware specific chunk size, but exposed to user-space as bytes. This means the filesystem parts of resctrl need to know how the hardware counts, to convert the user provided byte value to chunks. The interface between the architecture's resctrl code and the filesystem ought to treat everything as bytes. Change the unit of resctrl_cqm_threshold to bytes. resctrl_arch_rmid_read() still returns its value in chunks, so this needs converting to bytes. As all the users have been touched, rename the variable to resctrl_rmid_realloc_threshold, which describes what the value is for. Neither r->num_rmid nor hw_res->mon_scale are guaranteed to be a power of 2, so the existing code introduces a rounding error from resctrl's theoretical fraction of the cache usage. This behaviour is kept as it ensures the user visible value matches the value read from hardware when the rmid will be reallocated. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-20-james.morse@arm.com	2022-09-23 14:23:41 +02:00
James Morse	38f72f50d6	x86/resctrl: Move get_corrected_mbm_count() into resctrl_arch_rmid_read() resctrl_arch_rmid_read() is intended as the function that an architecture agnostic resctrl filesystem driver can use to read a value in bytes from a counter. Currently the function returns the MBM values in chunks directly from hardware. When reading a bandwidth counter, get_corrected_mbm_count() must be used to correct the value read. get_corrected_mbm_count() is architecture specific, this work should be done in resctrl_arch_rmid_read(). Move the function calls. This allows the resctrl filesystems's chunks value to be removed in favour of the architecture private version. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-19-james.morse@arm.com	2022-09-23 14:22:53 +02:00
James Morse	1d81d15db3	x86/resctrl: Move mbm_overflow_count() into resctrl_arch_rmid_read() resctrl_arch_rmid_read() is intended as the function that an architecture agnostic resctrl filesystem driver can use to read a value in bytes from a counter. Currently the function returns the MBM values in chunks directly from hardware. When reading a bandwidth counter, mbm_overflow_count() must be used to correct for any possible overflow. mbm_overflow_count() is architecture specific, its behaviour should be part of resctrl_arch_rmid_read(). Move the mbm_overflow_count() calls into resctrl_arch_rmid_read(). This allows the resctrl filesystems's prev_msr to be removed in favour of the architecture private version. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-18-james.morse@arm.com	2022-09-23 14:22:20 +02:00
James Morse	8286618aca	x86/resctrl: Pass the required parameters into resctrl_arch_rmid_read() resctrl_arch_rmid_read() is intended as the function that an architecture agnostic resctrl filesystem driver can use to read a value in bytes from a hardware register. Currently the function returns the MBM values in chunks directly from hardware. To convert this to bytes, some correction and overflow calculations are needed. These depend on the resource and domain structures. Overflow detection requires the old chunks value. None of this is available to resctrl_arch_rmid_read(). MPAM requires the resource and domain structures to find the MMIO device that holds the registers. Pass the resource and domain to resctrl_arch_rmid_read(). This makes rmid_dirty() too big. Instead merge it with its only caller, and the name is kept as a local variable. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-17-james.morse@arm.com	2022-09-23 14:21:25 +02:00
James Morse	4d044c521a	x86/resctrl: Abstract __rmid_read() __rmid_read() selects the specified eventid and returns the counter value from the MSR. The error handling is architecture specific, and handled by the callers, rdtgroup_mondata_show() and __mon_event_count(). Error handling should be handled by architecture specific code, as a different architecture may have different requirements. MPAM's counters can report that they are 'not ready', requiring a second read after a short delay. This should be hidden from resctrl. Make __rmid_read() the architecture specific function for reading a counter. Rename it resctrl_arch_rmid_read() and move the error handling into it. A read from a counter that hardware supports but resctrl does not now returns -EINVAL instead of -EIO from the default case in __mon_event_count(). It isn't possible for user-space to see this change as resctrl doesn't expose counters it doesn't support. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-16-james.morse@arm.com	2022-09-23 14:17:20 +02:00
Kees Cook	712f210a45	x86/microcode/AMD: Track patch allocation size explicitly In preparation for reducing the use of ksize(), record the actual allocation size for later memcpy(). This avoids copying extra (uninitialized!) bytes into the patch buffer when the requested allocation size isn't exactly the size of a kmalloc bucket. Additionally, fix potential future issues where runtime bounds checking will notice that the buffer was allocated to a smaller value than returned by ksize(). Fixes: 757885e94a22 ("x86, microcode, amd: Early microcode patch loading support for AMD") Suggested-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Borislav Petkov <bp@suse.de> Link: https://lore.kernel.org/lkml/CA+DvKQ+bp7Y7gmaVhacjv9uF6Ar-o4tet872h4Q8RPYPJjcJQA@mail.gmail.com/	2022-09-23 13:46:26 +02:00
James Morse	fea62d370d	x86/resctrl: Allow per-rmid arch private storage to be reset To abstract the rmid counters into a helper that returns the number of bytes counted, architecture specific per-rmid state is needed. It needs to be possible to reset this hidden state, as the values may outlive the life of an rmid, or the mount time of the filesystem. mon_event_read() is called with first = true when an rmid is first allocated in mkdir_mondata_subdir(). Add resctrl_arch_reset_rmid() and call it from __mon_event_count()'s rr->first check. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov <bp@suse.de> Reviewed-by: Jamie Iles <quic_jiles@quicinc.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Xin Hao <xhao@linux.alibaba.com> Tested-by: Shaopeng Tan <tan.shaopeng@fujitsu.com> Tested-by: Cristian Marussi <cristian.marussi@arm.com> Link: https://lore.kernel.org/r/20220902154829.30399-15-james.morse@arm.com	2022-09-23 12:49:04 +02:00
Sean Christopherson	50b2d49baf	KVM: x86: Inject #UD on emulated XSETBV if XSAVES isn't enabled Inject #UD when emulating XSETBV if CR4.OSXSAVE is not set. This also covers the "XSAVE not supported" check, as setting CR4.OSXSAVE=1 #GPs if XSAVE is not supported (and userspace gets to keep the pieces if it forces incoherent vCPU state). Add a comment to kvm_emulate_xsetbv() to call out that the CPU checks CR4.OSXSAVE before checking for intercepts. AMD'S APM implies that #UD has priority (says that intercepts are checked before #GP exceptions), while Intel's SDM says nothing about interception priority. However, testing on hardware shows that both AMD and Intel CPUs prioritize the #UD over interception. Fixes: 02d4160fbd76 ("x86: KVM: add xsetbv to the emulator") Cc: stable@vger.kernel.org Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220824033057.3576315-4-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-22 17:04:20 -04:00
Dr. David Alan Gilbert	a1020a25e6	KVM: x86: Always enable legacy FP/SSE in allowed user XFEATURES Allow FP and SSE state to be saved and restored via KVM_{G,SET}_XSAVE on XSAVE-capable hosts even if their bits are not exposed to the guest via XCR0. Failing to allow FP+SSE first showed up as a QEMU live migration failure, where migrating a VM from a pre-XSAVE host, e.g. Nehalem, to an XSAVE host failed due to KVM rejecting KVM_SET_XSAVE. However, the bug also causes problems even when migrating between XSAVE-capable hosts as KVM_GET_SAVE won't set any bits in user_xfeatures if XSAVE isn't exposed to the guest, i.e. KVM will fail to actually migrate FP+SSE. Because KVM_{G,S}ET_XSAVE are designed to allowing migrating between hosts with and without XSAVE, KVM_GET_XSAVE on a non-XSAVE (by way of fpu_copy_guest_fpstate_to_uabi()) always sets the FP+SSE bits in the header so that KVM_SET_XSAVE will work even if the new host supports XSAVE. Fixes: ad856280ddea ("x86/kvm/fpu: Limit guest user_xfeatures to supported bits of XCR0") bz: https://bugzilla.redhat.com/show_bug.cgi?id=2079311 Cc: stable@vger.kernel.org Cc: Leonardo Bras <leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> [sean: add comment, massage changelog] Signed-off-by: Sean Christopherson <seanjc@google.com> Message-Id: <20220824033057.3576315-3-seanjc@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-09-22 17:04:19 -04:00

... 3 4 5 6 7 ...

42484 Commits