linux

iv/linux

Author	SHA1	Message	Date
David Matlack	eb29860570	KVM: x86/mmu: Do not recover dirty-tracked NX Huge Pages Do not recover (i.e. zap) an NX Huge Page that is being dirty tracked, as it will just be faulted back in at the same 4KiB granularity when accessed by a vCPU. This may need to be changed if KVM ever supports 2MiB (or larger) dirty tracking granularity, or faulting huge pages during dirty tracking for reads/executes. However for now, these zaps are entirely wasteful. In order to check if this commit increases the CPU usage of the NX recovery worker thread I used a modified version of execute_perf_test [1] that supports splitting guest memory into multiple slots and reports /proc/pid/schedstat:se.sum_exec_runtime for the NX recovery worker just before tearing down the VM. The goal was to force a large number of NX Huge Page recoveries and see if the recovery worker used any more CPU. Test Setup: echo 1000 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms echo 10 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio Test Command: ./execute_perf_test -v64 -s anonymous_hugetlb_1gb -x 16 -o \| kvm-nx-lpage-re:se.sum_exec_runtime \| \| ---------------------------------------- \| Run \| Before \| After \| ------- \| ------------------ \| ------------------- \| 1 \| 730.084105 \| 724.375314 \| 2 \| 728.751339 \| 740.581988 \| 3 \| 736.264720 \| 757.078163 \| Comparing the median results, this commit results in about a 1% increase CPU usage of the NX recovery worker when testing a VM with 16 slots. However, the effect is negligible with the default halving time of NX pages, which is 1 hour rather than 10 seconds given by period_ms = 1000, ratio = 10. [1] https://lore.kernel.org/kvm/20221019234050.3919566-2-dmatlack@google.com/ Signed-off-by: David Matlack <dmatlack@google.com> Message-Id: <20221103204421.1146958-1-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-11-17 11:26:35 -05:00
Paolo Bonzini	63d28a25e0	KVM: x86/mmu: simplify kvm_tdp_mmu_map flow when guest has to retry A removed SPTE is never present, hence the "if" in kvm_tdp_mmu_map only fails in the exact same conditions that the earlier loop tested in order to issue a "break". So, instead of checking twice the condition (upper level SPTEs could not be created or was frozen), just exit the loop with a goto---the usual poor-man C replacement for RAII early returns. While at it, do not use the "ret" variable for return values of functions that do not return a RET_PF_* enum. This is clearer and also makes it possible to initialize ret to RET_PF_RETRY. Suggested-by: Robert Hoo <robert.hu@linux.intel.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-11-17 11:10:25 -05:00
David Matlack	c4b33d28ea	KVM: x86/mmu: Split huge pages mapped by the TDP MMU on fault Now that the TDP MMU has a mechanism to split huge pages, use it in the fault path when a huge page needs to be replaced with a mapping at a lower level. This change reduces the negative performance impact of NX HugePages. Prior to this change if a vCPU executed from a huge page and NX HugePages was enabled, the vCPU would take a fault, zap the huge page, and mapping the faulting address at 4KiB with execute permissions enabled. The rest of the memory would be left unmapped and have to be faulted back in by the guest upon access (read, write, or execute). If guest is backed by 1GiB, a single execute instruction can zap an entire GiB of its physical address space. For example, it can take a VM longer to execute from its memory than to populate that memory in the first place: $ ./execute_perf_test -s anonymous_hugetlb_1gb -v96 Populating memory : 2.748378795s Executing from memory : 2.899670885s With this change, such faults split the huge page instead of zapping it, which avoids the non-present faults on the rest of the huge page: $ ./execute_perf_test -s anonymous_hugetlb_1gb -v96 Populating memory : 2.729544474s Executing from memory : 0.111965688s <--- This change also reduces the performance impact of dirty logging when eager_page_split=N. eager_page_split=N (abbreviated "eps=N" below) can be desirable for read-heavy workloads, as it avoids allocating memory to split huge pages that are never written and avoids increasing the TLB miss cost on reads of those pages. \| Config: ept=Y, tdp_mmu=Y, 5% writes \| \| Iteration 1 dirty memory time \| \| --------------------------------------------- \| vCPU Count \| eps=N (Before) \| eps=N (After) \| eps=Y \| ------------ \| -------------- \| ------------- \| ------------ \| 2 \| 0.332305091s \| 0.019615027s \| 0.006108211s \| 4 \| 0.353096020s \| 0.019452131s \| 0.006214670s \| 8 \| 0.453938562s \| 0.019748246s \| 0.006610997s \| 16 \| 0.719095024s \| 0.019972171s \| 0.007757889s \| 32 \| 1.698727124s \| 0.021361615s \| 0.012274432s \| 64 \| 2.630673582s \| 0.031122014s \| 0.016994683s \| 96 \| 3.016535213s \| 0.062608739s \| 0.044760838s \| Eager page splitting remains beneficial for write-heavy workloads, but the gap is now reduced. \| Config: ept=Y, tdp_mmu=Y, 100% writes \| \| Iteration 1 dirty memory time \| \| --------------------------------------------- \| vCPU Count \| eps=N (Before) \| eps=N (After) \| eps=Y \| ------------ \| -------------- \| ------------- \| ------------ \| 2 \| 0.317710329s \| 0.296204596s \| 0.058689782s \| 4 \| 0.337102375s \| 0.299841017s \| 0.060343076s \| 8 \| 0.386025681s \| 0.297274460s \| 0.060399702s \| 16 \| 0.791462524s \| 0.298942578s \| 0.062508699s \| 32 \| 1.719646014s \| 0.313101996s \| 0.075984855s \| 64 \| 2.527973150s \| 0.455779206s \| 0.079789363s \| 96 \| 2.681123208s \| 0.673778787s \| 0.165386739s \| Further study is needed to determine if the remaining gap is acceptable for customer workloads or if eager_page_split=N still requires a-priori knowledge of the VM workload, especially when considering these costs extrapolated out to large VMs with e.g. 416 vCPUs and 12TB RAM. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Mingwei Zhang <mizhang@google.com> Message-Id: <20221109185905.486172-3-dmatlack@google.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-11-17 10:52:48 -05:00
Paolo Bonzini	92292c1de2	KVM selftests updates for 6.2 perf_util: - Add support for pinning vCPUs in dirty_log_perf_test. - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the so called "perf util" tests. - Rename the so called "perf_util" framework to "memstress". ucall: - Add a common pool-based ucall implementation (code dedup and pre-work for running SEV (and beyond) guests in selftests. - Fix an issue in ARM's single-step test when using the new pool-based implementation; atomics don't play nice with single-step exceptions. init: - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). x86: - Clean up x86's page tabe management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values. -----BEGIN PGP SIGNATURE----- iQJGBAABCgAwFiEEMHr+pfEFOIzK+KY1YJEiAU0MEvkFAmN1iEgSHHNlYW5qY0Bn b29nbGUuY29tAAoJEGCRIgFNDBL5L8IP/0ep959xR64MGFi+6w/nbHW7/nz1dzo3 A+/58HkU0cdMB8ycw5oaN+wwQKE02mCLNRY1GecddL/Y3+L073KXSUAu5dCc5KtW bIiX45c+HEdxsKr474zawvoIETkESo+dblMEptK5D/y+iBdazTIxzwMsiuGrYdCf Xd8deNMzgzZQDB74UeRDvedW6VSMIzw4XT1tTpdKwFkrAmGa+I3hBbcMVWXPrsJ8 cTDFfjeWIX5flJYV7Mjb+WNdgb9av2srLn7d7dOibCoSedseYJr2E1lVA5i9z0N4 Kbv60bA0O4yEi8xpkBwlCQJTT9u3NBkEfB0efr3HNJzKKc1BUWYITBHaZbJOSVsD ZmrqmeKlboRDsfI3wG7HZyzgD+QOJJKi5b1ooV8iIhLbt7iW2jpo9zZoaImYR+dp /VS+aLWLvJcPiepvMp2NM+vQfq0rYyQim3izIeoCqJPKTOpKbRIJU2FVNmXyF6wy ryeuNdYPsnxRpYjWCXTn7c6VSOVhq9D5AbV679TTJC/VpjdJci0U1coTCK3NtnrW Yd4GDmF3dlFI/4b3uH98Xzk7QKowHbGul3FmqoANKUOTE7SgUIHavCaT8FTc80Lr tQhoRqaYnYvkG6jeTzmOCtj9lz7pjMV5+l+dtN+aGHosd76Vjt3hzPdb9r1XeYhr 9CdjnwbQPqKG =t+ut -----END PGP SIGNATURE----- Merge tag 'kvm-selftests-6.2-1' of https://github.com/kvm-x86/linux into HEAD KVM selftests updates for 6.2 perf_util: - Add support for pinning vCPUs in dirty_log_perf_test. - Add a lightweight psuedo RNG for guest use, and use it to randomize the access pattern and write vs. read percentage in the so called "perf util" tests. - Rename the so called "perf_util" framework to "memstress". ucall: - Add a common pool-based ucall implementation (code dedup and pre-work for running SEV (and beyond) guests in selftests. - Fix an issue in ARM's single-step test when using the new pool-based implementation; LDREX/STREX don't play nice with single-step exceptions. init: - Provide a common constructor and arch hook, which will eventually be used by x86 to automatically select the right hypercall (AMD vs. Intel). x86: - Clean up x86's page tabe management. - Clean up and enhance the "smaller maxphyaddr" test, and add a related test to cover generic emulation failure. - Clean up the nEPT support checks. - Add X86_PROPERTY_* framework to retrieve multi-bit CPUID values.	2022-11-17 09:09:29 -05:00
David Matlack	ecb89a5172	KVM: selftests: Check for KVM nEPT support using "feature" MSRs When checking for nEPT support in KVM, use kvm_get_feature_msr() instead of vcpu_get_msr() to retrieve KVM's default TRUE_PROCBASED_CTLS and PROCBASED_CTLS2 MSR values, i.e. don't require a VM+vCPU to query nEPT support. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20220927165209.930904-1-dmatlack@google.com [sean: rebase on merged code, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:07 -08:00
David Matlack	5c107f7085	KVM: selftests: Assert in prepare_eptp() that nEPT is supported Now that a VM isn't needed to check for nEPT support, assert that KVM supports nEPT in prepare_eptp() instead of skipping the test, and push the TEST_REQUIRE() check out to individual tests. The require+assert are somewhat redundant and will incur some amount of ongoing maintenance burden, but placing the "require" logic in the test makes it easier to find/understand a test's requirements and in this case, provides a very strong hint that the test cares about nEPT. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20220927165209.930904-1-dmatlack@google.com [sean: rebase on merged code, write changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:07 -08:00
Sean Christopherson	b941ba2380	KVM: selftests: Drop helpers for getting specific KVM supported CPUID entry Drop kvm_get_supported_cpuid_entry() and its inner helper now that all known usage can use X86_FEATURE_, X86_PROPERTY_, X86_PMU_FEATURE_*, or the dedicated Family/Model helpers. Providing "raw" access to CPUID leafs is undesirable as it encourages open coding CPUID checks, which is often error prone and not self-documenting. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-13-seanjc@google.com	2022-11-16 16:59:07 -08:00
Sean Christopherson	074e9d4c9c	KVM: selftests: Add and use KVM helpers for x86 Family and Model Add KVM variants of the x86 Family and Model helpers, and use them in the PMU event filter test. Open code the retrieval of KVM's supported CPUID entry 0x1.0 in anticipation of dropping kvm_get_supported_cpuid_entry(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-12-seanjc@google.com	2022-11-16 16:59:06 -08:00
Sean Christopherson	24f3f9898e	KVM: selftests: Add dedicated helpers for getting x86 Family and Model Add dedicated helpers for getting x86's Family and Model, which are the last holdouts that "need" raw access to CPUID information. FMS info is a mess and requires not only splicing together multiple values, but requires doing so conditional in the Family case. Provide wrappers to reduce the odds of copy+paste errors, but mostly to allow for the eventual removal of kvm_get_supported_cpuid_entry(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-11-seanjc@google.com	2022-11-16 16:59:06 -08:00
Sean Christopherson	5228c02a4c	KVM: selftests: Add PMU feature framework, use in PMU event filter test Add an X86_PMU_FEATURE_* framework to simplify probing architectural events on Intel PMUs, which require checking the length of a bit vector and the _absence_ of a "feature" bit. Add helpers for both KVM and "this CPU", and use the newfangled magic (along with X86_PROPERTY_*) to clean up pmu_event_filter_test. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-10-seanjc@google.com	2022-11-16 16:59:05 -08:00
Sean Christopherson	4feb9d21a4	KVM: selftests: Convert vmx_pmu_caps_test to use X86_PROPERTY_* Add X86_PROPERTY_PMU_VERSION and use it in vmx_pmu_caps_test to replace open coded versions of the same functionality. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-9-seanjc@google.com	2022-11-16 16:59:05 -08:00
Sean Christopherson	5dc19f1c7d	KVM: selftests: Convert AMX test to use X86_PROPRETY_XXX Add and use x86 "properties" for the myriad AMX CPUID values that are validated by the AMX test. Drop most of the test's single-usage helpers so that the asserts more precisely capture what check failed. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-8-seanjc@google.com	2022-11-16 16:59:05 -08:00
Sean Christopherson	40854713e3	KVM: selftests: Add kvm_cpu_() support for X86_PROPERTY_ Extent X86_PROPERTY_* support to KVM, i.e. add kvm_cpu_property() and kvm_cpu_has_p(), and use the new helpers in kvm_get_cpu_address_width(). No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-7-seanjc@google.com	2022-11-16 16:59:04 -08:00
Sean Christopherson	a29e6e383b	KVM: selftests: Refactor kvm_cpuid_has() to prep for X86_PROPERTY_* support Refactor kvm_cpuid_has() to prepare for extending X86_PROPERTY_* support to KVM as well as "this CPU". No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-6-seanjc@google.com	2022-11-16 16:59:04 -08:00
Sean Christopherson	d80ddad2a8	KVM: selftests: Use X86_PROPERTY_MAX_KVM_LEAF in CPUID test Use X86_PROPERTY_MAX_KVM_LEAF to replace the equivalent open coded check on KVM's maximum paravirt CPUID leaf. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-5-seanjc@google.com	2022-11-16 16:59:03 -08:00
Sean Christopherson	53a7dc0f21	KVM: selftests: Add X86_PROPERTY_* framework to retrieve CPUID values Introduce X86_PROPERTY_* to allow retrieving values/properties from CPUID leafs, e.g. MAXPHYADDR from CPUID.0x80000008. Use the same core code as X86_FEATURE_*, the primary difference is that properties are multi-bit values, whereas features enumerate a single bit. Add this_cpu_has_p() to allow querying whether or not a property exists based on the maximum leaf associated with the property, e.g. MAXPHYADDR doesn't exist if the max leaf for 0x8000_xxxx is less than 0x8000_0008. Use the new property infrastructure in vm_compute_max_gfn() to prove that the code works as intended. Future patches will convert additional selftests code. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-4-seanjc@google.com	2022-11-16 16:59:03 -08:00
Sean Christopherson	ee37955366	KVM: selftests: Refactor X86_FEATURE_* framework to prep for X86_PROPERTY_* Refactor the X86_FEATURE_* framework to prepare for extending the core logic to support "properties". The "feature" framework allows querying a single CPUID bit to detect the presence of a feature; the "property" framework will extend the idea to allow querying a value, i.e. to get a value that is a set of contiguous bits in a CPUID leaf. Opportunistically add static asserts to ensure features are fully defined at compile time, and to try and catch mistakes in the definition of features. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-3-seanjc@google.com	2022-11-16 16:59:03 -08:00
Sean Christopherson	3bd396353d	KVM: selftests: Add X86_FEATURE_PAE and use it calc "fallback" MAXPHYADDR Add X86_FEATURE_PAE and use it to guesstimate the MAXPHYADDR when the MAXPHYADDR CPUID entry isn't supported. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006005125.680782-2-seanjc@google.com	2022-11-16 16:59:02 -08:00
David Matlack	3ae5b759c3	KVM: selftests: Add a test for KVM_CAP_EXIT_ON_EMULATION_FAILURE Add a selftest to exercise the KVM_CAP_EXIT_ON_EMULATION_FAILURE capability. This capability is also exercised through smaller_maxphyaddr_emulation_test, but that test requires allow_smaller_maxphyaddr=Y, which is off by default on Intel when ept=Y and unconditionally disabled on AMD when npt=Y. This new test ensures that KVM_CAP_EXIT_ON_EMULATION_FAILURE is exercised independent of allow_smaller_maxphyaddr. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-11-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:02 -08:00
David Matlack	a323845d6c	KVM: selftests: Expect #PF(RSVD) when TDP is disabled Change smaller_maxphyaddr_emulation_test to expect a #PF(RSVD), rather than an emulation failure, when TDP is disabled. KVM only needs to emulate instructions to emulate a smaller guest.MAXPHYADDR when TDP is enabled. Fixes: `39bbcc3a4e` ("selftests: kvm: Allows userspace to handle emulation errors.") Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-10-dmatlack@google.com [sean: massage comment to talk about having to emulate due to MAXPHYADDR] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:01 -08:00
Sean Christopherson	b9635930f0	KVM: selftests: Provide error code as a KVM_ASM_SAFE() output Provide the error code on a fault in KVM_ASM_SAFE(), e.g. to allow tests to assert that #PF generates the correct error code without needing to manually install a #PF handler. Use r10 as the scratch register for the error code, as it's already clobbered by the asm blob (loaded with the RIP of the to-be-executed instruction). Deliberately load the output "error_code" even in the non-faulting path so that error_code is always initialized with deterministic data (the aforementioned RIP), i.e to ensure a selftest won't end up with uninitialized consumption regardless of how KVM_ASM_SAFE() is used. Don't clear r10 in the non-faulting case and instead load error code with the RIP (see above). The error code is valid if and only if an exception occurs, and '0' isn't necessarily a better "invalid" value, e.g. '0' could result in false passes for a buggy test. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-9-dmatlack@google.com	2022-11-16 16:59:01 -08:00
Sean Christopherson	f2e5b53b4b	KVM: selftests: Avoid JMP in non-faulting path of KVM_ASM_SAFE() Clear R9 in the non-faulting path of KVM_ASM_SAFE() and fall through to to a common load of "vector" to effectively load "vector" with '0' to reduce the code footprint of the asm blob, to reduce the runtime overhead of the non-faulting path (when "vector" is stored in a register), and so that additional output constraints that are valid if and only if a fault occur are loaded even in the non-faulting case. A future patch will add a 64-bit output for the error code, and if its output is not explicitly loaded with _something_, the user of the asm blob can end up technically consuming uninitialized data. Using a common path to load the output constraints will allow using an existing scratch register, e.g. r10, to hold the error code in the faulting path, while also guaranteeing the error code is initialized with deterministic data in the non-faulting patch (r10 is loaded with the RIP of to-be-executed instruction). Consuming the error code when a fault doesn't occur would obviously be a test bug, but there's no guarantee the compiler will detect uninitialized consumption. And conversely, it's theoretically possible that the compiler might throw a false positive on uninitialized data, e.g. if the compiler can't determine that the non-faulting path won't touch the error code. Alternatively, the error code could be explicitly loaded in the non-faulting path, but loading a 64-bit memory\|register output operand with an explicitl value requires a sign-extended "MOV imm32, r/m64", which isn't exactly straightforward and has a largish code footprint. And loading the error code with what is effectively garbage (from a scratch register) avoids having to choose an arbitrary value for the non-faulting case. Opportunistically remove a rogue asterisk in the block comment. Signed-off-by: Sean Christopherson <seanjc@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-8-dmatlack@google.com	2022-11-16 16:59:00 -08:00
David Matlack	77f7813cc2	KVM: selftests: Copy KVM PFERR masks into selftests Copy KVM's macros for page fault error masks into processor.h so they can be used in selftests. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-7-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:00 -08:00
David Matlack	d6ecfe976a	KVM: x86/mmu: Use BIT{,_ULL}() for PFERR masks Use the preferred BIT() and BIT_ULL() to construct the PFERR masks rather than open-coding the bit shifting. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-6-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:59:00 -08:00
David Matlack	19a2b32f5d	KVM: selftests: Move flds instruction emulation failure handling to header Move the flds instruction emulation failure handling code to a header so it can be re-used in an upcoming test. No functional change intended. Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-5-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:59 -08:00
David Matlack	50824c6eee	KVM: selftests: Delete dead ucall code Delete a bunch of code related to ucall handling from smaller_maxphyaddr_emulation_test. The only thing smaller_maxphyaddr_emulation_test needs to check is that the vCPU exits with UCALL_DONE after the second vcpu_run(). Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-4-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:59 -08:00
David Matlack	48e5937339	KVM: selftests: Explicitly require instructions bytes Hard-code the flds instruction and assert the exact instruction bytes are present in run->emulation_failure. The test already requires the instruction bytes to be present because that's the only way the test will advance the RIP past the flds and get to GUEST_DONE(). Note that KVM does not necessarily return exactly 2 bytes in run->emulation_failure since it may not know the exact instruction length in all cases. So just assert that run->emulation_failure.insn_size is at least 2. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-3-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:58 -08:00
David Matlack	52d3a4fb5b	KVM: selftests: Rename emulator_error_test to smaller_maxphyaddr_emulation_test Rename emulator_error_test to smaller_maxphyaddr_emulation_test and update the comment at the top of the file to document that this is explicitly a test to validate that KVM emulates instructions in response to an EPT violation when emulating a smaller MAXPHYADDR. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221102184654.282799-2-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:58 -08:00
Gautam Menghani	376bc1b458	KVM: selftests: Don't assume vcpu->id is '0' in xAPIC state test In xapic_state_test's test_icr(), explicitly skip iterations that would match vcpu->id instead of assuming vcpu->id is '0', so that IPIs are are correctly sent to non-existent vCPUs. Suggested-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/kvm/YyoZr9rXSSMEtdh5@google.com Signed-off-by: Gautam Menghani <gautammenghani201@gmail.com> Link: https://lore.kernel.org/r/20221017175819.12672-1-gautammenghani201@gmail.com [sean: massage shortlog and changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:58 -08:00
Vishal Annapurve	2115713cfa	KVM: selftests: Add arch specific post vm creation hook Add arch specific API kvm_arch_vm_post_create to perform any required setup after VM creation. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-4-vannapurve@google.com [sean: place x86's implementation by vm_arch_vcpu_add()] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:57 -08:00
Vishal Annapurve	e1ab31245c	KVM: selftests: Add arch specific initialization Introduce arch specific API: kvm_selftest_arch_init to allow each arch to handle initialization before running any selftest logic. Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-3-vannapurve@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:57 -08:00
Vishal Annapurve	197ebb713a	KVM: selftests: move common startup logic to kvm_util.c Consolidate common startup logic in one place by implementing a single setup function with __attribute((constructor)) for all selftests within kvm_util.c. This allows moving logic like: /* Tell stdout not to buffer its content */ setbuf(stdout, NULL); to a single file for all selftests. This will also allow any required setup at entry in future to be done in common main function. Link: https://lore.kernel.org/lkml/Ywa9T+jKUpaHLu%2Fl@google.com Suggested-by: Sean Christopherson <seanjc@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Reviewed-by: Peter Gonda <pgonda@google.com> Signed-off-by: Vishal Annapurve <vannapurve@google.com> Link: https://lore.kernel.org/r/20221115213845.3348210-2-vannapurve@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:56 -08:00
Sean Christopherson	96b69958c7	KVM: selftests: Play nice with huge pages when getting PTEs/GPAs Play nice with huge pages when getting PTEs and translating GVAs to GPAs, there's no reason to disallow using huge pages in selftests. Use PG_LEVEL_NONE to indicate that the caller doesn't care about the mapping level and just wants to get the pte+level. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-8-seanjc@google.com	2022-11-16 16:58:56 -08:00
Sean Christopherson	efe91dc307	KVM: selftests: Use vm_get_page_table_entry() in addr_arch_gva2gpa() Use vm_get_page_table_entry() in addr_arch_gva2gpa() to get the leaf PTE instead of manually walking page tables. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-7-seanjc@google.com	2022-11-16 16:58:56 -08:00
Sean Christopherson	99d51c6eef	KVM: selftests: Use virt_get_pte() when getting PTE pointer Use virt_get_pte() in vm_get_page_table_entry() instead of open coding equivalent code. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-6-seanjc@google.com	2022-11-16 16:58:55 -08:00
Sean Christopherson	ed0b58fc6f	KVM: selftests: Verify parent PTE is PRESENT when getting child PTE Verify the parent PTE is PRESENT when getting a child via virt_get_pte() so that the helper can be used for getting PTEs/GPAs without losing sanity checks that the walker isn't wandering into the weeds. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-5-seanjc@google.com	2022-11-16 16:58:55 -08:00
Sean Christopherson	91add12d38	KVM: selftests: Remove useless shifts when creating guest page tables Remove the pointless shift from GPA=>GFN and immediately back to GFN=>GPA when creating guest page tables. Ignore the other walkers that have a similar pattern for the moment, they will be converted to use virt_get_pte() in the near future. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-4-seanjc@google.com	2022-11-16 16:58:55 -08:00
Sean Christopherson	751f280017	KVM: selftests: Drop reserved bit checks from PTE accessor Drop the reserved bit checks from the helper to retrieve a PTE, there's very little value in sanity checking the constructed page tables as any will quickly be noticed in the form of an unexpected #PF. The checks also place unnecessary restrictions on the usage of the helpers, e.g. if a test _wanted_ to set reserved bits for whatever reason. Removing the NX check in particular allows for the removal of the @vcpu param, which will in turn allow the helper to be reused nearly verbatim for addr_gva2gpa(). Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-3-seanjc@google.com	2022-11-16 16:58:54 -08:00
Sean Christopherson	816c54b747	KVM: selftests: Drop helpers to read/write page table entries Drop vm_{g,s}et_page_table_entry() and instead expose the "inner" helper (was _vm_get_page_table_entry()) that returns a _pointer_ to the PTE, i.e. let tests directly modify PTEs instead of bouncing through helpers that just make life difficult. Opportunsitically use BIT_ULL() in emulator_error_test, and use the MAXPHYADDR define to set the "rogue" GPA bit instead of open coding the same value. No functional change intended. Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006004512.666529-2-seanjc@google.com	2022-11-16 16:58:54 -08:00
Colin Ian King	9a6418dacd	KVM: selftests: Fix spelling mistake "begining" -> "beginning" There is a spelling mistake in an assert message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Reviewed-by: Jim Mattson <jmattson@google.com> Link: https://lore.kernel.org/r/20220928213458.64089-1-colin.i.king@gmail.com [sean: fix an ironic typo in the changelog] Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 16:58:53 -08:00
Peter Gonda	426729b2cf	KVM: selftests: Add ucall pool based implementation To play nice with guests whose stack memory is encrypted, e.g. AMD SEV, introduce a new "ucall pool" implementation that passes the ucall struct via dedicated memory (which can be mapped shared, a.k.a. as plain text). Because not all architectures have access to the vCPU index in the guest, use a bitmap with atomic accesses to track which entries in the pool are free/used. A list+lock could also work in theory, but synchronizing the individual pointers to the guest would be a mess. Note, there's no need to rewalk the bitmap to ensure success. If all vCPUs are simply allocating, success is guaranteed because there are enough entries for all vCPUs. If one or more vCPUs are freeing and then reallocating, success is guaranteed because vCPUs _always_ walk the bitmap from 0=>N; if vCPU frees an entry and then wins a race to re-allocate, then either it will consume the entry it just freed (bit is the first free bit), or the losing vCPU is guaranteed to see the freed bit (winner consumes an earlier bit, which the loser hasn't yet visited). Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Signed-off-by: Peter Gonda <pgonda@google.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-8-seanjc@google.com	2022-11-16 16:58:53 -08:00
Sean Christopherson	28a65567ac	KVM: selftests: Drop now-unnecessary ucall_uninit() Drop ucall_uninit() and ucall_arch_uninit() now that ARM doesn't modify the host's copy of ucall_exit_mmio_addr, i.e. now that there's no need to reset the pointer before potentially creating a new VM. The few calls to ucall_uninit() are all immediately followed by kvm_vm_free(), and that is likely always going to hold true, i.e. it's extremely unlikely a test will want to effectively disable ucall in the middle of a test. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-7-seanjc@google.com	2022-11-16 16:58:53 -08:00
Sean Christopherson	03b4750533	KVM: selftests: Make arm64's MMIO ucall multi-VM friendly Fix a mostly-theoretical bug where ARM's ucall MMIO setup could result in different VMs stomping on each other by cloberring the global pointer. Fix the most obvious issue by saving the MMIO gpa into the VM. A more subtle bug is that creating VMs in parallel (on multiple tasks) could result in a VM using the wrong address. Synchronizing a global to a guest effectively snapshots the value on a per-VM basis, i.e. the "global" is already prepped to work with multiple VMs, but setting the global in the host is not thread-safe. To fix that bug, add write_guest_global() to allow stuffing a VM's copy of a "global" without modifying the host value. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-6-seanjc@google.com	2022-11-16 16:58:52 -08:00
Peter Gonda	cf4694be2b	tools: Add atomic_test_and_set_bit() Add x86 and generic implementations of atomic_test_and_set_bit() to allow KVM selftests to atomically manage bitmaps. Note, the generic version is taken from arch_test_and_set_bit() as of commit `415d832497` ("locking/atomic: Make test_and_*_bit() ordered on failure"). Signed-off-by: Peter Gonda <pgonda@google.com> Co-developed-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-5-seanjc@google.com	2022-11-16 16:58:52 -08:00
Sean Christopherson	dc88244bf5	KVM: selftests: Automatically do init_ucall() for non-barebones VMs Do init_ucall() automatically during VM creation to kill two (three?) birds with one stone. First, initializing ucall immediately after VM creations allows forcing aarch64's MMIO ucall address to immediately follow memslot0. This is still somewhat fragile as tests could clobber the MMIO address with a new memslot, but it's safe-ish since tests have to be conversative when accounting for memslot0. And this can be hardened in the future by creating a read-only memslot for the MMIO page (KVM ARM exits with MMIO if the guest writes to a read-only memslot). Add a TODO to document that selftests can and should use a memslot for the ucall MMIO (doing so requires yet more rework because tests assumes thay can use all memslots except memslot0). Second, initializing ucall for all VMs prepares for making ucall initialization meaningful on all architectures. aarch64 is currently the only arch that needs to do any setup, but that will change in the future by switching to a pool-based implementation (instead of the current stack-based approach). Lastly, defining the ucall MMIO address from common code will simplify switching all architectures (except s390) to a common MMIO-based ucall implementation (if there's ever sufficient motivation to do so). Cc: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-4-seanjc@google.com	2022-11-16 16:58:51 -08:00
Sean Christopherson	ef38871eb2	KVM: selftests: Consolidate boilerplate code in get_ucall() Consolidate the actual copying of a ucall struct from guest=>host into the common get_ucall(). Return a host virtual address instead of a guest virtual address even though the addr_gva2hva() part could be moved to get_ucall() too. Conceptually, get_ucall() is invoked from the host and should return a host virtual address (and returning NULL for "nothing to see here" is far superior to returning 0). Use pointer shenanigans instead of an unnecessary bounce buffer when the caller of get_ucall() provides a valid pointer. Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-3-seanjc@google.com	2022-11-16 16:58:51 -08:00
Sean Christopherson	7046638192	KVM: selftests: Consolidate common code for populating ucall struct Make ucall() a common helper that populates struct ucall, and only calls into arch code to make the actually call out to userspace. Rename all arch-specific helpers to make it clear they're arch-specific, and to avoid collisions with common helpers (one more on its way...) Add WRITE_ONCE() to stores in ucall() code (as already done to aarch64 code in commit `9e2f6498ef` ("selftests: KVM: Handle compiler optimizations in ucall")) to prevent clang optimizations breaking ucalls. Cc: Colton Lewis <coltonlewis@google.com> Reviewed-by: Andrew Jones <andrew.jones@linux.dev> Tested-by: Peter Gonda <pgonda@google.com> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221006003409.649993-2-seanjc@google.com	2022-11-16 16:58:51 -08:00
Sean Christopherson	b3d937722d	KVM: arm64: selftests: Disable single-step without relying on ucall() Automatically disable single-step when the guest reaches the end of the verified section instead of using an explicit ucall() to ask userspace to disable single-step. An upcoming change to implement a pool-based scheme for ucall() will add an atomic operation (bit test and set) in the guest ucall code, and if the compiler generate "old school" atomics, e.g. 40e57c: c85f7c20 ldxr x0, [x1] 40e580: aa100011 orr x17, x0, x16 40e584: c80ffc31 stlxr w15, x17, [x1] 40e588: 35ffffaf cbnz w15, 40e57c <__aarch64_ldset8_sync+0x1c> the guest will hang as the local exclusive monitor is reset by eret, i.e. the stlxr will always fail due to the debug exception taken to EL2. Link: https://lore.kernel.org/all/20221006003409.649993-8-seanjc@google.com Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Marc Zyngier <maz@kernel.org> Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221117002350.2178351-3-seanjc@google.com Reviewed-by: Oliver Upton <oliver.upton@linux.dev>	2022-11-16 16:58:14 -08:00
Sean Christopherson	1cec8bbc17	KVM: arm64: selftests: Disable single-step with correct KVM define Disable single-step by setting debug.control to KVM_GUESTDBG_ENABLE, not to SINGLE_STEP_DISABLE. The latter is an arbitrary test enum that just happens to have the same value as KVM_GUESTDBG_ENABLE, and so effectively disables single-step debug. No functional change intended. Cc: Reiji Watanabe <reijiw@google.com> Fixes: `b18e4d4aeb` ("KVM: arm64: selftests: Add a test case for KVM_GUESTDBG_SINGLESTEP") Signed-off-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221117002350.2178351-2-seanjc@google.com Reviewed-by: Oliver Upton <oliver.upton@linux.dev>	2022-11-16 16:57:59 -08:00
David Matlack	7812d80c0f	KVM: selftests: Rename perf_test_util symbols to memstress Replace the perf_test_ prefix on symbol names with memstress_ to match the new file name. "memstress" better describes the functionality proveded by this library, which is to provide functionality for creating and running a VM that stresses VM memory by reading and writing to guest memory on all vCPUs in parallel. "memstress" also contains the same number of chracters as "perf_test", making it a drop-in replacement in symbols, e.g. function names, without impacting line lengths. Also the lack of underscore between "mem" and "stress" makes it clear "memstress" is a noun. Signed-off-by: David Matlack <dmatlack@google.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Link: https://lore.kernel.org/r/20221012165729.3505266-4-dmatlack@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>	2022-11-16 10:58:32 -08:00

1 2 3 4 5 ...

1137438 Commits