/*
 * Kernel-based Virtual Machine driver for Linux
 *
 * derived from drivers/kvm/kvm_main.c
 *
 * Copyright (C) 2006 Qumranet, Inc.
 * Copyright (C) 2008 Qumranet, Inc.
 * Copyright IBM Corporation, 2008
 * Copyright 2010 Red Hat, Inc. and/or its affiliates.
 *
 * Authors:
 *   Avi Kivity   <avi@qumranet.com>
 *   Yaniv Kamay  <yaniv@qumranet.com>
 *   Amit Shah    <amit.shah@qumranet.com>
 *   Ben-Ami Yassour <benami@il.ibm.com>
 *
 * This work is licensed under the terms of the GNU GPL, version 2.  See
 * the COPYING file in the top-level directory.
 *
 */
#include <linux/kvm_host.h>
KVM: Portability: split kvm_vcpu_ioctl
This patch splits kvm_vcpu_ioctl into architecture independent parts, and
x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
Common ioctls for all architectures are:
KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT,
KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU
Note that some PPC chips don't have an FPU, so we might need an #ifdef
around KVM_GET/SET_FPU one day.
x86 specific ioctls are:
KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS
An interesting aspect is vcpu_load/vcpu_put. We now have a common
vcpu_load/put which does the preemption stuff, and an architecture
specific kvm_arch_vcpu_load/put. In the x86 case, this one calls the
vmx/svm function defined in kvm_x86_ops.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-11 21:16:52 +04:00
#include "irq.h"
#include "mmu.h"
#include "i8254.h"
#include "tss.h"
#include "kvm_cache_regs.h"
#include "x86.h"
#include "cpuid.h"
#include <linux/clocksource.h>
#include <linux/interrupt.h>
#include <linux/kvm.h>
#include <linux/fs.h>
#include <linux/vmalloc.h>
#include <linux/module.h>
#include <linux/mman.h>
#include <linux/highmem.h>
#include <linux/iommu.h>
#include <linux/intel-iommu.h>
#include <linux/cpufreq.h>
#include <linux/user-return-notifier.h>
#include <linux/srcu.h>
include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
The percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities to include those
headers directly instead of assuming availability. As this conversion
needs to touch a large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the following.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and tries to put the new include such that its order conforms
to its surroundings. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
widely available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build tests were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
2010-03-24 11:04:11 +03:00
#include <linux/slab.h>
#include <linux/perf_event.h>
#include <linux/uaccess.h>
#include <linux/hash.h>
#include <linux/pci.h>
#include <trace/events/kvm.h>

#define CREATE_TRACE_POINTS
#include "trace.h"

#include <asm/debugreg.h>
#include <asm/msr.h>
#include <asm/desc.h>
#include <asm/mtrr.h>
#include <asm/mce.h>
#include <asm/i387.h>
#include <asm/fpu-internal.h> /* Ugh! */
#include <asm/xcr.h>
#include <asm/pvclock.h>
#include <asm/div64.h>
#define MAX_IO_MSRS 256
#define KVM_MAX_MCE_BANKS 32
#define KVM_MCE_CAP_SUPPORTED (MCG_CTL_P | MCG_SER_P)

#define emul_to_vcpu(ctxt) \
	container_of(ctxt, struct kvm_vcpu, arch.emulate_ctxt)

/* EFER defaults:
 * - enable syscall by default because it's emulated by KVM
 * - enable LME and LMA by default on 64-bit KVM
 */
#ifdef CONFIG_X86_64
static
u64 __read_mostly efer_reserved_bits = ~((u64)(EFER_SCE | EFER_LME | EFER_LMA));
#else
static u64 __read_mostly efer_reserved_bits = ~((u64)EFER_SCE);
#endif
#define VM_STAT(x) offsetof(struct kvm, stat.x), KVM_STAT_VM
#define VCPU_STAT(x) offsetof(struct kvm_vcpu, stat.x), KVM_STAT_VCPU

static void update_cr8_intercept(struct kvm_vcpu *vcpu);
static void process_nmi(struct kvm_vcpu *vcpu);

struct kvm_x86_ops *kvm_x86_ops;
EXPORT_SYMBOL_GPL(kvm_x86_ops);

static bool ignore_msrs = 0;
module_param(ignore_msrs, bool, S_IRUGO | S_IWUSR);

bool kvm_has_tsc_control;
EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
u32 kvm_max_guest_tsc_khz;
EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
/* tsc tolerance in parts per million - default to 1/2 of the NTP threshold */
static u32 tsc_tolerance_ppm = 250;
module_param(tsc_tolerance_ppm, uint, S_IRUGO | S_IWUSR);
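/*
 * Illustration only, not part of the driver: a minimal user-space sketch of
 * how a parts-per-million tolerance such as tsc_tolerance_ppm could decide
 * whether rate scaling is needed.  The helper names adjust_khz() and
 * needs_tsc_scaling() are hypothetical, and the exact check performed by the
 * kernel may differ; this only assumes "scale when the requested rate falls
 * outside host_khz +/- tolerance".
 */
```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: scale khz by (1,000,000 + ppm) / 1,000,000. */
static uint32_t adjust_khz(uint32_t khz, int32_t ppm)
{
	uint64_t v = (uint64_t)khz * (uint64_t)(1000000 + ppm);

	return (uint32_t)(v / 1000000);
}

/* True when guest_khz lies outside the tolerance band around host_khz. */
static bool needs_tsc_scaling(uint32_t host_khz, uint32_t guest_khz,
			      uint32_t tolerance_ppm)
{
	uint32_t lo = adjust_khz(host_khz, -(int32_t)tolerance_ppm);
	uint32_t hi = adjust_khz(host_khz, (int32_t)tolerance_ppm);

	return guest_khz < lo || guest_khz > hi;
}
```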
#define KVM_NR_SHARED_MSRS 16

struct kvm_shared_msrs_global {
	int nr;
	u32 msrs[KVM_NR_SHARED_MSRS];
};

struct kvm_shared_msrs {
	struct user_return_notifier urn;
	bool registered;
	struct kvm_shared_msr_values {
		u64 host;
		u64 curr;
	} values[KVM_NR_SHARED_MSRS];
};

static struct kvm_shared_msrs_global __read_mostly shared_msrs_global;
static DEFINE_PER_CPU(struct kvm_shared_msrs, shared_msrs);
struct kvm_stats_debugfs_item debugfs_entries[] = {
	{ "pf_fixed", VCPU_STAT(pf_fixed) },
	{ "pf_guest", VCPU_STAT(pf_guest) },
	{ "tlb_flush", VCPU_STAT(tlb_flush) },
	{ "invlpg", VCPU_STAT(invlpg) },
	{ "exits", VCPU_STAT(exits) },
	{ "io_exits", VCPU_STAT(io_exits) },
	{ "mmio_exits", VCPU_STAT(mmio_exits) },
	{ "signal_exits", VCPU_STAT(signal_exits) },
	{ "irq_window", VCPU_STAT(irq_window_exits) },
	{ "nmi_window", VCPU_STAT(nmi_window_exits) },
	{ "halt_exits", VCPU_STAT(halt_exits) },
	{ "halt_wakeup", VCPU_STAT(halt_wakeup) },
	{ "hypercalls", VCPU_STAT(hypercalls) },
	{ "request_irq", VCPU_STAT(request_irq_exits) },
	{ "irq_exits", VCPU_STAT(irq_exits) },
	{ "host_state_reload", VCPU_STAT(host_state_reload) },
	{ "efer_reload", VCPU_STAT(efer_reload) },
	{ "fpu_reload", VCPU_STAT(fpu_reload) },
	{ "insn_emulation", VCPU_STAT(insn_emulation) },
	{ "insn_emulation_fail", VCPU_STAT(insn_emulation_fail) },
	{ "irq_injections", VCPU_STAT(irq_injections) },
	{ "nmi_injections", VCPU_STAT(nmi_injections) },
	{ "mmu_shadow_zapped", VM_STAT(mmu_shadow_zapped) },
	{ "mmu_pte_write", VM_STAT(mmu_pte_write) },
	{ "mmu_pte_updated", VM_STAT(mmu_pte_updated) },
	{ "mmu_pde_zapped", VM_STAT(mmu_pde_zapped) },
	{ "mmu_flooded", VM_STAT(mmu_flooded) },
	{ "mmu_recycled", VM_STAT(mmu_recycled) },
	{ "mmu_cache_miss", VM_STAT(mmu_cache_miss) },
	{ "mmu_unsync", VM_STAT(mmu_unsync) },
	{ "remote_tlb_flush", VM_STAT(remote_tlb_flush) },
	{ "largepages", VM_STAT(lpages) },
	{ NULL }
};
u64 __read_mostly host_xcr0;

int emulator_fix_hypercall(struct x86_emulate_ctxt *ctxt);

static inline void kvm_async_pf_hash_reset(struct kvm_vcpu *vcpu)
{
	int i;
	for (i = 0; i < roundup_pow_of_two(ASYNC_PF_PER_VCPU); i++)
		vcpu->arch.apf.gfns[i] = ~0;
}
static void kvm_on_user_return(struct user_return_notifier *urn)
{
	unsigned slot;
	struct kvm_shared_msrs *locals
		= container_of(urn, struct kvm_shared_msrs, urn);
	struct kvm_shared_msr_values *values;

	for (slot = 0; slot < shared_msrs_global.nr; ++slot) {
		values = &locals->values[slot];
		if (values->host != values->curr) {
			wrmsrl(shared_msrs_global.msrs[slot], values->host);
			values->curr = values->host;
		}
	}
	locals->registered = false;
	user_return_notifier_unregister(urn);
}
static void shared_msr_update(unsigned slot, u32 msr)
{
	struct kvm_shared_msrs *smsr;
	u64 value;

	smsr = &__get_cpu_var(shared_msrs);
	/* shared_msrs_global is only read here, and nobody should modify
	 * it at this time, so no lock is needed */
	if (slot >= shared_msrs_global.nr) {
		printk(KERN_ERR "kvm: invalid MSR slot!\n");
		return;
	}
	rdmsrl_safe(msr, &value);
	smsr->values[slot].host = value;
	smsr->values[slot].curr = value;
}
void kvm_define_shared_msr(unsigned slot, u32 msr)
{
	if (slot >= shared_msrs_global.nr)
		shared_msrs_global.nr = slot + 1;
	shared_msrs_global.msrs[slot] = msr;
	/* make sure the update to shared_msrs_global is visible */
	smp_wmb();
}
EXPORT_SYMBOL_GPL(kvm_define_shared_msr);

static void kvm_shared_msr_cpu_online(void)
{
	unsigned i;

	for (i = 0; i < shared_msrs_global.nr; ++i)
		shared_msr_update(i, shared_msrs_global.msrs[i]);
}
void kvm_set_shared_msr(unsigned slot, u64 value, u64 mask)
{
	struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);

	if (((value ^ smsr->values[slot].curr) & mask) == 0)
		return;
	smsr->values[slot].curr = value;
	wrmsrl(shared_msrs_global.msrs[slot], value);
	if (!smsr->registered) {
		smsr->urn.on_user_return = kvm_on_user_return;
		user_return_notifier_register(&smsr->urn);
		smsr->registered = true;
	}
}
EXPORT_SYMBOL_GPL(kvm_set_shared_msr);
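/*
 * Illustration only, not part of the driver: a small user-space sketch of the
 * masked comparison that kvm_set_shared_msr() uses to skip redundant MSR
 * writes.  The helper name msr_write_needed() is hypothetical.  A write is
 * needed only when the new value differs from the cached one in at least one
 * bit selected by the mask.
 */
```c
#include <stdbool.h>
#include <stdint.h>

/* True when value and curr differ in a bit that mask selects. */
static bool msr_write_needed(uint64_t curr, uint64_t value, uint64_t mask)
{
	return ((value ^ curr) & mask) != 0;
}
```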
static void drop_user_return_notifiers(void *ignore)
{
	struct kvm_shared_msrs *smsr = &__get_cpu_var(shared_msrs);

	if (smsr->registered)
		kvm_on_user_return(&smsr->urn);
}
u64 kvm_get_apic_base(struct kvm_vcpu *vcpu)
{
	if (irqchip_in_kernel(vcpu->kvm))
		return vcpu->arch.apic_base;
	else
		return vcpu->arch.apic_base;
}
EXPORT_SYMBOL_GPL(kvm_get_apic_base);

void kvm_set_apic_base(struct kvm_vcpu *vcpu, u64 data)
{
	/* TODO: reserved bits check */
	if (irqchip_in_kernel(vcpu->kvm))
		kvm_lapic_set_base(vcpu, data);
	else
		vcpu->arch.apic_base = data;
}
EXPORT_SYMBOL_GPL(kvm_set_apic_base);
#define EXCPT_BENIGN		0
#define EXCPT_CONTRIBUTORY	1
#define EXCPT_PF		2

static int exception_class(int vector)
{
	switch (vector) {
	case PF_VECTOR:
		return EXCPT_PF;
	case DE_VECTOR:
	case TS_VECTOR:
	case NP_VECTOR:
	case SS_VECTOR:
	case GP_VECTOR:
		return EXCPT_CONTRIBUTORY;
	default:
		break;
	}
	return EXCPT_BENIGN;
}
static void kvm_multiple_exception(struct kvm_vcpu *vcpu,
		unsigned nr, bool has_error, u32 error_code,
		bool reinject)
{
	u32 prev_nr;
	int class1, class2;

	kvm_make_request(KVM_REQ_EVENT, vcpu);

	if (!vcpu->arch.exception.pending) {
	queue:
		vcpu->arch.exception.pending = true;
		vcpu->arch.exception.has_error_code = has_error;
		vcpu->arch.exception.nr = nr;
		vcpu->arch.exception.error_code = error_code;
		vcpu->arch.exception.reinject = reinject;
		return;
	}

	/* an exception is already pending; check how the two combine */
	prev_nr = vcpu->arch.exception.nr;
	if (prev_nr == DF_VECTOR) {
		/* triple fault -> shutdown */
		kvm_make_request(KVM_REQ_TRIPLE_FAULT, vcpu);
		return;
	}
	class1 = exception_class(prev_nr);
	class2 = exception_class(nr);
	if ((class1 == EXCPT_CONTRIBUTORY && class2 == EXCPT_CONTRIBUTORY)
		|| (class1 == EXCPT_PF && class2 != EXCPT_BENIGN)) {
		/* generate double fault per SDM Table 5-5 */
		vcpu->arch.exception.pending = true;
		vcpu->arch.exception.has_error_code = true;
		vcpu->arch.exception.nr = DF_VECTOR;
		vcpu->arch.exception.error_code = 0;
	} else
		/* replace the previous exception with the new one, in the
		   hope that instruction re-execution will regenerate the
		   lost exception */
		goto queue;
}
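/*
 * Illustration only, not part of the driver: a user-space sketch of the
 * decision made by kvm_multiple_exception() when a second exception arrives
 * while one is still pending.  The names classify(), merge_exceptions(),
 * TRIPLE_FAULT and the VEC_* constants are hypothetical stand-ins; the VEC_*
 * values are the architectural x86 vector numbers.
 */
```c
enum excpt_class { BENIGN, CONTRIBUTORY, PAGE_FAULT };

enum {
	VEC_DE = 0, VEC_UD = 6, VEC_DF = 8, VEC_TS = 10,
	VEC_NP = 11, VEC_SS = 12, VEC_GP = 13, VEC_PF = 14,
};

#define TRIPLE_FAULT (-1)

static enum excpt_class classify(int vector)
{
	switch (vector) {
	case VEC_PF:
		return PAGE_FAULT;
	case VEC_DE:
	case VEC_TS:
	case VEC_NP:
	case VEC_SS:
	case VEC_GP:
		return CONTRIBUTORY;
	default:
		return BENIGN;
	}
}

/* Returns the vector that ends up pending, or TRIPLE_FAULT. */
static int merge_exceptions(int prev, int next)
{
	enum excpt_class c1, c2;

	if (prev == VEC_DF)
		return TRIPLE_FAULT;

	c1 = classify(prev);
	c2 = classify(next);
	if ((c1 == CONTRIBUTORY && c2 == CONTRIBUTORY) ||
	    (c1 == PAGE_FAULT && c2 != BENIGN))
		return VEC_DF;

	/* otherwise the new exception simply replaces the old one */
	return next;
}
```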
void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr)
{
	kvm_multiple_exception(vcpu, nr, false, 0, false);
}
EXPORT_SYMBOL_GPL(kvm_queue_exception);

void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr)
{
	kvm_multiple_exception(vcpu, nr, false, 0, true);
}
EXPORT_SYMBOL_GPL(kvm_requeue_exception);

void kvm_complete_insn_gp(struct kvm_vcpu *vcpu, int err)
{
	if (err)
		kvm_inject_gp(vcpu, 0);
	else
		kvm_x86_ops->skip_emulated_instruction(vcpu);
}
EXPORT_SYMBOL_GPL(kvm_complete_insn_gp);
void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
{
	++vcpu->stat.pf_guest;
	vcpu->arch.cr2 = fault->address;
	kvm_queue_exception_e(vcpu, PF_VECTOR, fault->error_code);
}
EXPORT_SYMBOL_GPL(kvm_inject_page_fault);

void kvm_propagate_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault)
{
	if (mmu_is_nested(vcpu) && !fault->nested_page_fault)
		vcpu->arch.nested_mmu.inject_page_fault(vcpu, fault);
	else
		vcpu->arch.mmu.inject_page_fault(vcpu, fault);
}
void kvm_inject_nmi(struct kvm_vcpu *vcpu)
{
	atomic_inc(&vcpu->arch.nmi_queued);
	kvm_make_request(KVM_REQ_NMI, vcpu);
}
EXPORT_SYMBOL_GPL(kvm_inject_nmi);

void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
{
	kvm_multiple_exception(vcpu, nr, true, error_code, false);
}
EXPORT_SYMBOL_GPL(kvm_queue_exception_e);

void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code)
{
	kvm_multiple_exception(vcpu, nr, true, error_code, true);
}
EXPORT_SYMBOL_GPL(kvm_requeue_exception_e);
/*
 * Checks if cpl <= required_cpl; if true, return true.  Otherwise queue
 * a #GP and return false.
 */
bool kvm_require_cpl(struct kvm_vcpu *vcpu, int required_cpl)
{
	if (kvm_x86_ops->get_cpl(vcpu) <= required_cpl)
		return true;
	kvm_queue_exception_e(vcpu, GP_VECTOR, 0);
	return false;
}
EXPORT_SYMBOL_GPL(kvm_require_cpl);
/*
 * This function is used to read from the physical memory of the currently
 * running guest.  Unlike kvm_read_guest_page, it can read either from guest
 * physical memory or from the guest's guest physical memory (nested case).
 */
int kvm_read_guest_page_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu,
			    gfn_t ngfn, void *data, int offset, int len,
			    u32 access)
{
	gfn_t real_gfn;
	gpa_t ngpa;

	ngpa     = gfn_to_gpa(ngfn);
	real_gfn = mmu->translate_gpa(vcpu, ngpa, access);
	if (real_gfn == UNMAPPED_GVA)
		return -EFAULT;

	real_gfn = gpa_to_gfn(real_gfn);

	return kvm_read_guest_page(vcpu->kvm, real_gfn, data, offset, len);
}
EXPORT_SYMBOL_GPL(kvm_read_guest_page_mmu);

int kvm_read_nested_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn,
			       void *data, int offset, int len, u32 access)
{
	return kvm_read_guest_page_mmu(vcpu, vcpu->arch.walk_mmu, gfn,
				       data, offset, len, access);
}
KVM: Portability: Move control register helper functions to x86.c
This patch moves the definitions of CR0_RESERVED_BITS,
CR4_RESERVED_BITS, and CR8_RESERVED_BITS along with the following
functions from kvm_main.c to x86.c:
set_cr0(), set_cr3(), set_cr4(), set_cr8(), get_cr8(), lmsw(),
load_pdptrs()
The static function wrapper inject_gp is duplicated in kvm_main.c and
x86.c for now, the version in kvm_main.c should disappear once the last
user of it is gone too.
The function load_pdptrs is no longer static, and now defined in x86.h
for the time being, until the last user of it is gone from kvm_main.c.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-29 18:09:35 +03:00
/*
 * Load the pae pdptrs.  Return true if they are all valid.
 */
int load_pdptrs(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, unsigned long cr3)
{
	gfn_t pdpt_gfn = cr3 >> PAGE_SHIFT;
	unsigned offset = ((cr3 & (PAGE_SIZE-1)) >> 5) << 2;
	int i;
	int ret;
	u64 pdpte[ARRAY_SIZE(mmu->pdptrs)];
	ret = kvm_read_guest_page_mmu(vcpu, mmu, pdpt_gfn, pdpte,
				      offset * sizeof(u64), sizeof(pdpte),
				      PFERR_USER_MASK|PFERR_WRITE_MASK);
	if (ret < 0) {
		ret = 0;
		goto out;
	}
	for (i = 0; i < ARRAY_SIZE(pdpte); ++i) {
		if (is_present_gpte(pdpte[i]) &&
		    (pdpte[i] & vcpu->arch.mmu.rsvd_bits_mask[0][2])) {
			ret = 0;
			goto out;
		}
	}
	ret = 1;
2010-09-10 19:30:57 +04:00
	memcpy(mmu->pdptrs, pdpte, sizeof(mmu->pdptrs));
2009-05-31 23:58:47 +04:00
	__set_bit(VCPU_EXREG_PDPTR,
		  (unsigned long *)&vcpu->arch.regs_avail);
	__set_bit(VCPU_EXREG_PDPTR,
		  (unsigned long *)&vcpu->arch.regs_dirty);
2007-10-29 18:09:35 +03:00
out:
	return ret;
}
2008-02-07 15:47:43 +03:00
EXPORT_SYMBOL_GPL(load_pdptrs);
2007-10-29 18:09:35 +03:00
2007-11-21 03:57:59 +03:00
static bool pdptrs_changed(struct kvm_vcpu *vcpu)
{
2010-09-10 19:30:57 +04:00
	u64 pdpte[ARRAY_SIZE(vcpu->arch.walk_mmu->pdptrs)];
2007-11-21 03:57:59 +03:00
	bool changed = true;
2010-09-10 19:30:53 +04:00
	int offset;
	gfn_t gfn;
2007-11-21 03:57:59 +03:00
	int r;
	if (is_long_mode(vcpu) || !is_pae(vcpu))
		return false;
2009-05-31 23:58:47 +04:00
	if (!test_bit(VCPU_EXREG_PDPTR,
		      (unsigned long *)&vcpu->arch.regs_avail))
		return true;
2010-12-05 18:30:00 +03:00
	gfn = (kvm_read_cr3(vcpu) & ~31u) >> PAGE_SHIFT;
	offset = (kvm_read_cr3(vcpu) & ~31u) & (PAGE_SIZE - 1);
2010-09-10 19:30:53 +04:00
	r = kvm_read_nested_guest_page(vcpu, gfn, pdpte, offset, sizeof(pdpte),
				       PFERR_USER_MASK | PFERR_WRITE_MASK);
2007-11-21 03:57:59 +03:00
	if (r < 0)
		goto out;
2010-09-10 19:30:57 +04:00
	changed = memcmp(pdpte, vcpu->arch.walk_mmu->pdptrs, sizeof(pdpte)) != 0;
2007-11-21 03:57:59 +03:00
out:
	return changed;
}
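The gfn/offset arithmetic in pdptrs_changed() can be tried in isolation.
What follows is a minimal userspace sketch, not kernel code: it assumes
4 KiB pages and a 32-bit CR3 as used in PAE mode, and the helper names
pdpt_gfn() and pdpt_offset() are hypothetical, chosen only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86 with PAGE_SHIFT == 12. */
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/*
 * Guest frame number of the page that holds the PAE
 * page-directory-pointer table. The low 5 bits of CR3 are masked off
 * because the table is 32-byte aligned.
 */
static uint32_t pdpt_gfn(uint32_t cr3)
{
	return (cr3 & ~31u) >> PAGE_SHIFT;
}

/* Byte offset of the table inside that page (a multiple of 32). */
static uint32_t pdpt_offset(uint32_t cr3)
{
	return (cr3 & ~31u) & (PAGE_SIZE - 1);
}
```

For example, a CR3 value of 0x12345FE0 splits into frame 0x12345 and
offset 0xFE0, so the four 8-byte entries are read from the last 32 bytes
of that page.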
2010-06-10 18:02:14 +04:00
int kvm_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
2007-10-29 18:09:35 +03:00
{
2010-05-12 12:40:42 +04:00
	unsigned long old_cr0 = kvm_read_cr0(vcpu);
	unsigned long update_bits = X86_CR0_PG | X86_CR0_WP |
				    X86_CR0_CD | X86_CR0_NW;
2010-01-06 20:10:22 +03:00
	cr0 |= X86_CR0_ET;
2010-01-21 16:28:46 +03:00
#ifdef CONFIG_X86_64
2010-04-28 20:15:31 +04:00
	if (cr0 & 0xffffffff00000000UL)
		return 1;
2010-01-21 16:28:46 +03:00
#endif
	cr0 &= ~CR0_RESERVED_BITS;
2007-10-29 18:09:35 +03:00
2010-04-28 20:15:31 +04:00
	if ((cr0 & X86_CR0_NW) && !(cr0 & X86_CR0_CD))
		return 1;
2007-10-29 18:09:35 +03:00
2010-04-28 20:15:31 +04:00
	if ((cr0 & X86_CR0_PG) && !(cr0 & X86_CR0_PE))
		return 1;
2007-10-29 18:09:35 +03:00
	if (!is_paging(vcpu) && (cr0 & X86_CR0_PG)) {
#ifdef CONFIG_X86_64
2010-01-21 16:31:50 +03:00
		if ((vcpu->arch.efer & EFER_LME)) {
2007-10-29 18:09:35 +03:00
			int cs_db, cs_l;
2010-04-28 20:15:31 +04:00
			if (!is_pae(vcpu))
				return 1;
2007-10-29 18:09:35 +03:00
			kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
2010-04-28 20:15:31 +04:00
			if (cs_l)
				return 1;
2007-10-29 18:09:35 +03:00
		} else
#endif
2010-09-10 19:30:57 +04:00
		if (is_pae(vcpu) && !load_pdptrs(vcpu, vcpu->arch.walk_mmu,
2010-12-05 18:30:00 +03:00
						 kvm_read_cr3(vcpu)))
2010-04-28 20:15:31 +04:00
			return 1;
2007-10-29 18:09:35 +03:00
	}
2012-07-02 05:18:48 +04:00
	if (!(cr0 & X86_CR0_PG) && kvm_read_cr4_bits(vcpu, X86_CR4_PCIDE))
		return 1;
2007-10-29 18:09:35 +03:00
	kvm_x86_ops->set_cr0(vcpu, cr0);
2011-02-21 06:21:30 +03:00
	if ((cr0 ^ old_cr0) & X86_CR0_PG) {
2010-11-12 09:47:01 +03:00
		kvm_clear_async_pf_completion_queue(vcpu);
2011-02-21 06:21:30 +03:00
		kvm_async_pf_hash_reset(vcpu);
	}
2010-11-12 09:47:01 +03:00
2010-05-12 12:40:42 +04:00
	if ((cr0 ^ old_cr0) & update_bits)
		kvm_mmu_reset_context(vcpu);
2010-04-28 20:15:31 +04:00
	return 0;
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL(kvm_set_cr0);
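The mode-independent consistency checks at the top of kvm_set_cr0() can
be expressed as a standalone predicate. This is only a sketch under
stated assumptions: the bit positions are the architectural CR0 bits
(PE = 0, NW = 29, CD = 30, PG = 31), the name cr0_value_ok() is
hypothetical, and the return convention is inverted relative to the
kernel function (1 here means the value is acceptable). The checks that
depend on vcpu state (EFER.LME, PAE, CS.L, CR4.PCIDE) are deliberately
left out.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural CR0 bit positions. */
#define CR0_PE (1ull << 0)	/* protected mode enable */
#define CR0_NW (1ull << 29)	/* not write-through */
#define CR0_CD (1ull << 30)	/* cache disable */
#define CR0_PG (1ull << 31)	/* paging */

/* Returns 1 if cr0 passes the state-independent checks, 0 otherwise. */
static int cr0_value_ok(uint64_t cr0)
{
	if (cr0 & 0xffffffff00000000ull)	/* upper 32 bits must be zero */
		return 0;
	if ((cr0 & CR0_NW) && !(cr0 & CR0_CD))	/* NW is only valid with CD */
		return 0;
	if ((cr0 & CR0_PG) && !(cr0 & CR0_PE))	/* paging needs protected mode */
		return 0;
	return 1;
}
```

A value with PG set but PE clear is rejected, which corresponds to the
path where kvm_set_cr0() returns 1.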
2007-10-29 18:09:35 +03:00
2008-02-24 12:20:43 +03:00
void kvm_lmsw(struct kvm_vcpu *vcpu, unsigned long msw)
2007-10-29 18:09:35 +03:00
{
2010-06-10 18:02:14 +04:00
	(void)kvm_set_cr0(vcpu, kvm_read_cr0_bits(vcpu, ~0x0eul) | (msw & 0x0f));
2007-10-29 18:09:35 +03:00
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL(kvm_lmsw);
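The bit merge performed by kvm_lmsw() is easier to see outside the
kernel. The sketch below uses a hypothetical helper, lmsw_merge(), that
takes the old CR0 value directly instead of reading it from a vcpu. The
mask ~0x0e keeps every old bit except MP, EM and TS (bits 1 to 3), and
the OR brings in the low four bits of the operand. Because bit 0 of the
old value survives the mask, PE can be set this way but never cleared.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine an old CR0 value with an LMSW operand the same way the
 * expression in kvm_lmsw() does. Hypothetical helper for illustration.
 */
static uint64_t lmsw_merge(uint64_t old_cr0, uint16_t msw)
{
	return (old_cr0 & ~0x0eull) | (msw & 0x0f);
}
```

With old_cr0 = 0x0b (PE, MP, TS set) and msw = 0x1, the result is 0x1:
MP and TS are cleared and PE stays set.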
2007-10-29 18:09:35 +03:00
2010-06-10 07:27:12 +04:00
int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
{
	u64 xcr0;
	/* Only support XCR_XFEATURE_ENABLED_MASK(xcr0) now */
	if (index != XCR_XFEATURE_ENABLED_MASK)
		return 1;
	xcr0 = xcr;
	if (kvm_x86_ops->get_cpl(vcpu) != 0)
		return 1;
	if (!(xcr0 & XSTATE_FP))
		return 1;
	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
		return 1;
	if (xcr0 & ~host_xcr0)
		return 1;
	vcpu->arch.xcr0 = xcr0;
	vcpu->guest_xcr0_loaded = 0;
	return 0;
}
int kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
{
	if (__kvm_set_xcr(vcpu, index, xcr)) {
		kvm_inject_gp(vcpu, 0);
		return 1;
	}
	return 0;
}
EXPORT_SYMBOL_GPL(kvm_set_xcr);
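The value checks in __kvm_set_xcr() reduce to three bit tests. A
minimal sketch follows, assuming the architectural XCR0 layout (x87 = bit
0, SSE = bit 1, AVX/YMM = bit 2). The function name xcr0_value_ok() and
its host_xcr0 parameter are hypothetical; the index and CPL checks of
the original are omitted, and 1 means "acceptable" here, the opposite of
the kernel's return convention.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural XCR0 feature bits. */
#define XSTATE_FP  0x1ull
#define XSTATE_SSE 0x2ull
#define XSTATE_YMM 0x4ull

/* Returns 1 if xcr0 is a legal value given the host's XCR0, else 0. */
static int xcr0_value_ok(uint64_t xcr0, uint64_t host_xcr0)
{
	if (!(xcr0 & XSTATE_FP))		/* x87 state is mandatory */
		return 0;
	if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
		return 0;			/* YMM depends on SSE */
	if (xcr0 & ~host_xcr0)			/* nothing the host lacks */
		return 0;
	return 1;
}
```

On a host whose XCR0 is 0x3, a guest request for 0x7 fails the last
test, since YMM state is not enabled on the host.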
2010-06-10 18:02:15 +04:00
int kvm_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
2007-10-29 18:09:35 +03:00
{
2009-12-07 13:16:48 +03:00
	unsigned long old_cr4 = kvm_read_cr4(vcpu);
2011-06-03 07:13:42 +04:00
	unsigned long pdptr_bits = X86_CR4_PGE | X86_CR4_PSE |
				   X86_CR4_PAE | X86_CR4_SMEP;
2010-04-28 20:15:31 +04:00
	if (cr4 & CR4_RESERVED_BITS)
		return 1;
2007-10-29 18:09:35 +03:00
2010-06-10 07:27:12 +04:00
	if (!guest_cpuid_has_xsave(vcpu) && (cr4 & X86_CR4_OSXSAVE))
		return 1;
2011-06-03 07:13:42 +04:00
	if (!guest_cpuid_has_smep(vcpu) && (cr4 & X86_CR4_SMEP))
		return 1;
2011-06-14 16:10:18 +04:00
	if (!guest_cpuid_has_fsgsbase(vcpu) && (cr4 & X86_CR4_RDWRGSFS))
		return 1;
2007-10-29 18:09:35 +03:00
	if (is_long_mode(vcpu)) {
2010-04-28 20:15:31 +04:00
		if (!(cr4 & X86_CR4_PAE))
			return 1;
2009-05-24 23:19:00 +04:00
	} else if (is_paging(vcpu) && (cr4 & X86_CR4_PAE)
		   && ((cr4 ^ old_cr4) & pdptr_bits)
2010-12-05 18:30:00 +03:00
		   && !load_pdptrs(vcpu, vcpu->arch.walk_mmu,
				   kvm_read_cr3(vcpu)))
2010-04-28 20:15:31 +04:00
		return 1;
2012-07-02 05:18:48 +04:00
	if ((cr4 & X86_CR4_PCIDE) && !(old_cr4 & X86_CR4_PCIDE)) {
		if (!guest_cpuid_has_pcid(vcpu))
			return 1;
		/* PCID can not be enabled when cr3[11:0]!=000H or EFER.LMA=0 */
		if ((kvm_read_cr3(vcpu) & X86_CR3_PCID_MASK) || !is_long_mode(vcpu))
			return 1;
	}
2011-05-26 00:03:24 +04:00
	if (kvm_x86_ops->set_cr4(vcpu, cr4))
2010-04-28 20:15:31 +04:00
		return 1;
2007-10-29 18:09:35 +03:00
2012-07-02 05:18:48 +04:00
	if (((cr4 ^ old_cr4) & pdptr_bits) ||
	    (!(cr4 & X86_CR4_PCIDE) && (old_cr4 & X86_CR4_PCIDE)))
2010-05-12 12:40:42 +04:00
		kvm_mmu_reset_context(vcpu);
2010-04-28 20:15:31 +04:00
2010-06-10 07:27:12 +04:00
	if ((cr4 ^ old_cr4) & X86_CR4_OSXSAVE)
2011-11-23 18:30:32 +04:00
		kvm_update_cpuid(vcpu);
2010-06-10 07:27:12 +04:00
2010-04-28 20:15:31 +04:00
	return 0;
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL(kvm_set_cr4);
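The rule for a 0 to 1 transition of CR4.PCIDE, as stated in the comment
inside kvm_set_cr4(), can be sketched as a small predicate. This is an
illustration only: pcide_can_be_set() and its parameters are
hypothetical stand-ins for the vcpu queries in the original
(guest_cpuid_has_pcid(), kvm_read_cr3(), is_long_mode()), and the mask
value 0xfff is taken from the "cr3[11:0]" wording of that comment.

```c
#include <assert.h>
#include <stdint.h>

/* Low 12 bits of CR3 hold the PCID once CR4.PCIDE is set. */
#define CR3_PCID_MASK 0xfffull

/*
 * Returns 1 if the guest may switch CR4.PCIDE from 0 to 1:
 * the feature must be exposed in CPUID, CR3[11:0] must be zero,
 * and the vcpu must be in long mode.
 */
static int pcide_can_be_set(int guest_has_pcid, uint64_t cr3, int long_mode)
{
	if (!guest_has_pcid)
		return 0;
	if ((cr3 & CR3_PCID_MASK) || !long_mode)
		return 0;
	return 1;
}
```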
2007-10-29 18:09:35 +03:00
2010-06-10 18:02:16 +04:00
int kvm_set_cr3(struct kvm_vcpu *vcpu, unsigned long cr3)
2007-10-29 18:09:35 +03:00
{
2010-12-05 18:30:00 +03:00
	if (cr3 == kvm_read_cr3(vcpu) && !pdptrs_changed(vcpu)) {
2008-09-23 20:18:34 +04:00
		kvm_mmu_sync_roots(vcpu);
2007-11-21 03:57:59 +03:00
		kvm_mmu_flush_tlb(vcpu);
2010-04-28 20:15:31 +04:00
		return 0;
2007-11-21 03:57:59 +03:00
	}
2007-10-29 18:09:35 +03:00
	if (is_long_mode(vcpu)) {
2012-07-02 05:18:48 +04:00
		if (kvm_read_cr4(vcpu) & X86_CR4_PCIDE) {
			if (cr3 & CR3_PCID_ENABLED_RESERVED_BITS)
				return 1;
		} else
			if (cr3 & CR3_L_MODE_RESERVED_BITS)
				return 1;
2007-10-29 18:09:35 +03:00
	} else {
		if (is_pae(vcpu)) {
2010-04-28 20:15:31 +04:00
			if (cr3 & CR3_PAE_RESERVED_BITS)
				return 1;
2010-09-10 19:30:57 +04:00
			if (is_paging(vcpu) &&
			    !load_pdptrs(vcpu, vcpu->arch.walk_mmu, cr3))
2010-04-28 20:15:31 +04:00
				return 1;
2007-10-29 18:09:35 +03:00
		}
		/*
		 * We don't check reserved bits in nonpae mode, because
		 * this isn't enforced, and VMware depends on this.
		 */
	}
	/*
	 * Does the new cr3 value map to physical memory? (Note, we
	 * catch an invalid cr3 even in real-mode, because it would
	 * cause trouble later on when we turn on paging anyway.)
	 *
	 * A real CPU would silently accept an invalid cr3 and would
	 * attempt to use it - with largely undefined (and often hard
	 * to debug) behavior on the guest side.
	 */
	if (unlikely(!gfn_to_memslot(vcpu->kvm, cr3 >> PAGE_SHIFT)))
2010-04-28 20:15:31 +04:00
		return 1;
	vcpu->arch.cr3 = cr3;
2010-12-05 19:56:11 +03:00
	__set_bit(VCPU_EXREG_CR3, (ulong *)&vcpu->arch.regs_avail);
2010-04-28 20:15:31 +04:00
	vcpu->arch.mmu.new_cr3(vcpu);
	return 0;
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL(kvm_set_cr3);
2007-10-29 18:09:35 +03:00
2010-12-21 13:12:00 +03:00
int kvm_set_cr8(struct kvm_vcpu *vcpu, unsigned long cr8)
2007-10-29 18:09:35 +03:00
{
2010-04-28 20:15:31 +04:00
	if (cr8 & CR8_RESERVED_BITS)
		return 1;
2007-10-29 18:09:35 +03:00
	if (irqchip_in_kernel(vcpu->kvm))
		kvm_lapic_set_tpr(vcpu, cr8);
	else
2007-12-13 18:50:52 +03:00
		vcpu->arch.cr8 = cr8;
2010-04-28 20:15:31 +04:00
	return 0;
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL(kvm_set_cr8);
2007-10-29 18:09:35 +03:00
2008-02-24 12:20:43 +03:00
unsigned long kvm_get_cr8(struct kvm_vcpu *vcpu)
2007-10-29 18:09:35 +03:00
{
if ( irqchip_in_kernel ( vcpu - > kvm ) )
return kvm_lapic_get_cr8 ( vcpu ) ;
else
2007-12-13 18:50:52 +03:00
return vcpu - > arch . cr8 ;
2007-10-29 18:09:35 +03:00
}
2008-02-24 12:20:43 +03:00
EXPORT_SYMBOL_GPL ( kvm_get_cr8 ) ;
2007-10-29 18:09:35 +03:00
2010-04-28 20:15:32 +04:00
static int __kvm_set_dr ( struct kvm_vcpu * vcpu , int dr , unsigned long val )
2010-04-13 11:05:23 +04:00
{
switch ( dr ) {
case 0 . . . 3 :
vcpu - > arch . db [ dr ] = val ;
if ( ! ( vcpu - > guest_debug & KVM_GUESTDBG_USE_HW_BP ) )
vcpu - > arch . eff_db [ dr ] = val ;
break ;
case 4 :
2010-04-28 20:15:32 +04:00
if ( kvm_read_cr4_bits ( vcpu , X86_CR4_DE ) )
return 1 ; /* #UD */
2010-04-13 11:05:23 +04:00
/* fall through */
case 6 :
2010-04-28 20:15:32 +04:00
if ( val & 0xffffffff00000000ULL )
return - 1 ; /* #GP */
2010-04-13 11:05:23 +04:00
vcpu - > arch . dr6 = ( val & DR6_VOLATILE ) | DR6_FIXED_1 ;
break ;
case 5 :
2010-04-28 20:15:32 +04:00
if ( kvm_read_cr4_bits ( vcpu , X86_CR4_DE ) )
return 1 ; /* #UD */
2010-04-13 11:05:23 +04:00
/* fall through */
default : /* 7 */
2010-04-28 20:15:32 +04:00
if ( val & 0xffffffff00000000ULL )
return - 1 ; /* #GP */
2010-04-13 11:05:23 +04:00
vcpu - > arch . dr7 = ( val & DR7_VOLATILE ) | DR7_FIXED_1 ;
if ( ! ( vcpu - > guest_debug & KVM_GUESTDBG_USE_HW_BP ) ) {
kvm_x86_ops - > set_dr7 ( vcpu , vcpu - > arch . dr7 ) ;
vcpu - > arch . switch_db_regs = ( val & DR7_BP_EN_MASK ) ;
}
break ;
}
return 0 ;
}
2010-04-28 20:15:32 +04:00
int kvm_set_dr ( struct kvm_vcpu * vcpu , int dr , unsigned long val )
{
int res ;
res = __kvm_set_dr ( vcpu , dr , val ) ;
if ( res > 0 )
kvm_queue_exception ( vcpu , UD_VECTOR ) ;
else if ( res < 0 )
kvm_inject_gp ( vcpu , 0 ) ;
return res ;
}
2010-04-13 11:05:23 +04:00
EXPORT_SYMBOL_GPL ( kvm_set_dr ) ;
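The return convention used by __kvm_set_dr() above (0 for success, 1 for #UD, -1 for #GP) and the DR4/DR5 aliasing onto DR6/DR7 can be sketched in isolation. This is a simplified illustration only: `toy_vcpu`, `toy_set_dr()` and `fault_vector()` are hypothetical names, and the DR6/DR7 fixed-bit masking and the hardware-breakpoint handling of the real function are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return codes, following the convention of the code above. */
enum { DR_OK = 0, DR_UD = 1, DR_GP = -1 };

/* Hypothetical, heavily reduced stand-in for the vcpu debug state. */
struct toy_vcpu {
	bool cr4_de;        /* CR4.DE: debugging extensions enabled */
	uint64_t db[4];     /* DR0..DR3 */
	uint64_t dr6, dr7;
};

static int toy_set_dr(struct toy_vcpu *v, int dr, uint64_t val)
{
	assert(dr >= 0 && dr <= 7);

	switch (dr) {
	case 0: case 1: case 2: case 3:
		v->db[dr] = val;
		return DR_OK;
	case 4:
		if (v->cr4_de)
			return DR_UD;   /* DR4 is not an alias when CR4.DE is set */
		/* fall through */
	case 6:
		if (val >> 32)
			return DR_GP;   /* upper 32 bits must be clear */
		v->dr6 = val;       /* real code also applies fixed/volatile masks */
		return DR_OK;
	case 5:
		if (v->cr4_de)
			return DR_UD;
		/* fall through */
	default: /* 7 */
		if (val >> 32)
			return DR_GP;
		v->dr7 = val;
		return DR_OK;
	}
}

/* Map a result to the x86 exception vector: 6 is #UD, 13 is #GP, -1 is none. */
static int fault_vector(int res)
{
	return res > 0 ? 6 : (res < 0 ? 13 : -1);
}
```

Keeping the fault decision in a helper that only returns a code, and injecting the exception in a thin wrapper, lets callers such as an instruction emulator reuse the checks without side effects.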
2010-04-28 20:15:32 +04:00
static int _kvm_get_dr ( struct kvm_vcpu * vcpu , int dr , unsigned long * val )
2010-04-13 11:05:23 +04:00
{
switch ( dr ) {
case 0 . . . 3 :
* val = vcpu - > arch . db [ dr ] ;
break ;
case 4 :
2010-04-28 20:15:32 +04:00
if ( kvm_read_cr4_bits ( vcpu , X86_CR4_DE ) )
2010-04-13 11:05:23 +04:00
return 1 ;
/* fall through */
case 6 :
* val = vcpu - > arch . dr6 ;
break ;
case 5 :
2010-04-28 20:15:32 +04:00
if ( kvm_read_cr4_bits ( vcpu , X86_CR4_DE ) )
2010-04-13 11:05:23 +04:00
return 1 ;
/* fall through */
default : /* 7 */
* val = vcpu - > arch . dr7 ;
break ;
}
return 0 ;
}
2010-04-28 20:15:32 +04:00
int kvm_get_dr ( struct kvm_vcpu * vcpu , int dr , unsigned long * val )
{
if ( _kvm_get_dr ( vcpu , dr , val ) ) {
kvm_queue_exception ( vcpu , UD_VECTOR ) ;
return 1 ;
}
return 0 ;
}
2010-04-13 11:05:23 +04:00
EXPORT_SYMBOL_GPL ( kvm_get_dr ) ;
2011-11-10 16:57:23 +04:00
bool kvm_rdpmc ( struct kvm_vcpu * vcpu )
{
u32 ecx = kvm_register_read ( vcpu , VCPU_REGS_RCX ) ;
u64 data ;
int err ;
err = kvm_pmu_read_pmc ( vcpu , ecx , & data ) ;
if ( err )
return err ;
kvm_register_write ( vcpu , VCPU_REGS_RAX , ( u32 ) data ) ;
kvm_register_write ( vcpu , VCPU_REGS_RDX , data > > 32 ) ;
return err ;
}
EXPORT_SYMBOL_GPL ( kvm_rdpmc ) ;
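kvm_rdpmc() above returns the 64-bit counter value to the guest in the EDX:EAX register pair. A minimal sketch of that split, with the hypothetical helper name `split_edx_eax()`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 64-bit value into the low half (EAX) and high half (EDX). */
static void split_edx_eax(uint64_t data, uint32_t *eax, uint32_t *edx)
{
	assert(eax != NULL && edx != NULL);
	*eax = (uint32_t)data;          /* low 32 bits */
	*edx = (uint32_t)(data >> 32);  /* high 32 bits */
}
```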
2007-10-10 19:16:19 +04:00
/*
* List of msr numbers which we expose to userspace through KVM_GET_MSRS
* and KVM_SET_MSRS , and KVM_GET_MSR_INDEX_LIST .
*
* This list is modified at module load time to reflect the
2009-10-06 21:24:50 +04:00
* capabilities of the host cpu . This capabilities test skips MSRs that are
* kvm - specific . Those are put at the beginning of the list .
2007-10-10 19:16:19 +04:00
*/
2009-10-06 21:24:50 +04:00
2012-08-01 18:01:42 +04:00
# define KVM_SAVE_MSRS_BEGIN 10
2007-10-10 19:16:19 +04:00
static u32 msrs_to_save [ ] = {
2009-10-06 21:24:50 +04:00
MSR_KVM_SYSTEM_TIME , MSR_KVM_WALL_CLOCK ,
2010-05-11 20:17:41 +04:00
MSR_KVM_SYSTEM_TIME_NEW , MSR_KVM_WALL_CLOCK_NEW ,
2010-01-17 16:51:22 +03:00
HV_X64_MSR_GUEST_OS_ID , HV_X64_MSR_HYPERCALL ,
2011-07-11 23:28:14 +04:00
HV_X64_MSR_APIC_ASSIST_PAGE , MSR_KVM_ASYNC_PF_EN , MSR_KVM_STEAL_TIME ,
2012-06-24 20:25:07 +04:00
MSR_KVM_PV_EOI_EN ,
2007-10-10 19:16:19 +04:00
MSR_IA32_SYSENTER_CS , MSR_IA32_SYSENTER_ESP , MSR_IA32_SYSENTER_EIP ,
2010-07-17 17:03:26 +04:00
MSR_STAR ,
2007-10-10 19:16:19 +04:00
# ifdef CONFIG_X86_64
MSR_CSTAR , MSR_KERNEL_GS_BASE , MSR_SYSCALL_MASK , MSR_LSTAR ,
# endif
2010-09-01 11:23:35 +04:00
MSR_IA32_TSC , MSR_IA32_CR_PAT , MSR_VM_HSAVE_PA
2007-10-10 19:16:19 +04:00
} ;
static unsigned num_msrs_to_save ;
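The comment above says the list is trimmed at module load time to what the host CPU supports, while the first KVM_SAVE_MSRS_BEGIN entries (the kvm-specific ones) are exempt from that test. One way such an in-place filter could look is sketched below. The names `filter_msr_list()`, `msr_probe_fn` and `probe_even()` are hypothetical, and the probe is a stand-in for whatever host check the real code performs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe: returns nonzero if the host supports the MSR. */
typedef int (*msr_probe_fn)(uint32_t index);

/*
 * Compact the list in place. The first keep_first entries are kept
 * unconditionally; later entries survive only if probe() accepts them.
 * Returns the new number of valid entries.
 */
static size_t filter_msr_list(uint32_t *list, size_t n, size_t keep_first,
			      msr_probe_fn probe)
{
	size_t i, j;

	assert(list != NULL && probe != NULL);
	if (keep_first > n)
		keep_first = n;

	for (i = j = keep_first; i < n; i++) {
		if (!probe(list[i]))
			continue;
		if (j < i)
			list[j] = list[i];
		j++;
	}
	return j;
}

/* Toy probe for demonstration only: pretend even indices are supported. */
static int probe_even(uint32_t index)
{
	return (index & 1) == 0;
}
```

The returned count plays the role of num_msrs_to_save: entries beyond it are stale and must not be reported to userspace.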
static u32 emulated_msrs [ ] = {
2011-09-22 12:55:52 +04:00
MSR_IA32_TSCDEADLINE ,
2007-10-10 19:16:19 +04:00
MSR_IA32_MISC_ENABLE ,
2010-07-07 15:09:38 +04:00
MSR_IA32_MCG_STATUS ,
MSR_IA32_MCG_CTL ,
2007-10-10 19:16:19 +04:00
} ;
2010-05-06 13:38:43 +04:00
static int set_efer ( struct kvm_vcpu * vcpu , u64 efer )
2007-10-30 20:44:17 +03:00
{
2010-05-12 12:40:42 +04:00
u64 old_efer = vcpu - > arch . efer ;
2010-05-06 13:38:43 +04:00
if ( efer & efer_reserved_bits )
return 1 ;
2007-10-30 20:44:17 +03:00
if ( is_paging ( vcpu )
2010-05-06 13:38:43 +04:00
& & ( vcpu - > arch . efer & EFER_LME ) ! = ( efer & EFER_LME ) )
return 1 ;
2007-10-30 20:44:17 +03:00
2009-02-02 18:23:51 +03:00
if ( efer & EFER_FFXSR ) {
struct kvm_cpuid_entry2 * feat ;
feat = kvm_find_cpuid_entry ( vcpu , 0x80000001 , 0 ) ;
2010-05-06 13:38:43 +04:00
if ( ! feat | | ! ( feat - > edx & bit ( X86_FEATURE_FXSR_OPT ) ) )
return 1 ;
2009-02-02 18:23:51 +03:00
}
2008-11-25 22:17:11 +03:00
if ( efer & EFER_SVME ) {
struct kvm_cpuid_entry2 * feat ;
feat = kvm_find_cpuid_entry ( vcpu , 0x80000001 , 0 ) ;
2010-05-06 13:38:43 +04:00
if ( ! feat | | ! ( feat - > ecx & bit ( X86_FEATURE_SVM ) ) )
return 1 ;
2008-11-25 22:17:11 +03:00
}
2007-10-30 20:44:17 +03:00
efer & = ~ EFER_LMA ;
2010-01-21 16:31:50 +03:00
efer | = vcpu - > arch . efer & EFER_LMA ;
2007-10-30 20:44:17 +03:00
2010-05-12 12:40:40 +04:00
kvm_x86_ops - > set_efer ( vcpu , efer ) ;
2009-03-31 12:31:54 +04:00
vcpu - > arch . mmu . base_role . nxe = ( efer & EFER_NX ) & & ! tdp_enabled ;
2010-05-06 13:38:43 +04:00
2010-05-12 12:40:42 +04:00
/* Update reserved bits */
if ( ( efer ^ old_efer ) & EFER_NX )
kvm_mmu_reset_context ( vcpu ) ;
2010-05-06 13:38:43 +04:00
return 0 ;
2007-10-30 20:44:17 +03:00
}
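The core of set_efer() above is three rules: reject reserved bits, refuse to toggle EFER.LME while paging is on, and never let the caller change EFER.LMA. A reduced sketch under stated assumptions: `toy_vcpu` and `toy_set_efer()` are hypothetical, the CPUID-based FFXSR/SVME checks and the MMU context reset are omitted, and the reserved-bit mask is passed in rather than read from a global.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural EFER bit positions. */
#define EFER_SCE (1ULL << 0)
#define EFER_LME (1ULL << 8)
#define EFER_LMA (1ULL << 10)
#define EFER_NX  (1ULL << 11)

/* Hypothetical, reduced vcpu state. */
struct toy_vcpu {
	uint64_t efer;
	bool paging;   /* CR0.PG */
};

/* Returns 0 on success, 1 if the write must be refused. */
static int toy_set_efer(struct toy_vcpu *v, uint64_t efer, uint64_t reserved)
{
	assert(v != NULL);

	if (efer & reserved)
		return 1;

	/* LME may not change while paging is enabled. */
	if (v->paging && ((v->efer ^ efer) & EFER_LME))
		return 1;

	/* LMA is read-only for the writer: keep the current value. */
	efer &= ~EFER_LMA;
	efer |= v->efer & EFER_LMA;

	v->efer = efer;
	return 0;
}
```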
2008-01-31 16:57:37 +03:00
void kvm_enable_efer_bits ( u64 mask )
{
efer_reserved_bits & = ~ mask ;
}
EXPORT_SYMBOL_GPL ( kvm_enable_efer_bits ) ;
2007-10-30 20:44:17 +03:00
/*
* Writes msr value into the appropriate " register " .
* Returns 0 on success , non - 0 otherwise .
* Assumes vcpu_load ( ) was already called .
*/
int kvm_set_msr ( struct kvm_vcpu * vcpu , u32 msr_index , u64 data )
{
return kvm_x86_ops - > set_msr ( vcpu , msr_index , data ) ;
}
KVM: Portability: split kvm_vcpu_ioctl
This patch splits kvm_vcpu_ioctl into architecture independent parts, and
x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
Common ioctls for all architectures are:
KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT,
KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU
Note that some PPC chips don't have an FPU, so we might need an #ifdef
around KVM_GET/SET_FPU one day.
x86 specific ioctls are:
KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS
An interesting aspect is vcpu_load/vcpu_put. We now have a common
vcpu_load/put which does the preemption stuff, and an architecture
specific kvm_arch_vcpu_load/put. In the x86 case, this one calls the
vmx/svm function defined in kvm_x86_ops.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-11 21:16:52 +04:00
/*
* Adapt set_msr ( ) to msr_io ( ) ' s calling convention
*/
static int do_set_msr ( struct kvm_vcpu * vcpu , unsigned index , u64 * data )
{
return kvm_set_msr ( vcpu , index , * data ) ;
}
2008-02-15 22:52:47 +03:00
static void kvm_write_wall_clock ( struct kvm * kvm , gpa_t wall_clock )
{
2010-05-04 16:00:37 +04:00
int version ;
int r ;
2008-06-03 18:17:31 +04:00
struct pvclock_wall_clock wc ;
2010-01-27 14:13:49 +03:00
struct timespec boot ;
2008-02-15 22:52:47 +03:00
if ( ! wall_clock )
return ;
2010-05-04 16:00:37 +04:00
r = kvm_read_guest ( kvm , wall_clock , & version , sizeof ( version ) ) ;
if ( r )
return ;
if ( version & 1 )
+ + version ; /* first time write, random junk */
+ + version ;
2008-02-15 22:52:47 +03:00
kvm_write_guest ( kvm , wall_clock , & version , sizeof ( version ) ) ;
2008-06-03 18:17:31 +04:00
/*
* The guest calculates current wall clock time by adding
2010-09-19 04:38:14 +04:00
* system time ( updated by kvm_guest_time_update below ) to the
2008-06-03 18:17:31 +04:00
* wall clock specified here . guest system time equals host
* system time for us , thus we must fill in host boot time here .
*/
2010-01-27 14:13:49 +03:00
getboottime ( & boot ) ;
2008-06-03 18:17:31 +04:00
2012-07-20 20:44:24 +04:00
if ( kvm - > arch . kvmclock_offset ) {
struct timespec ts = ns_to_timespec ( kvm - > arch . kvmclock_offset ) ;
boot = timespec_sub ( boot , ts ) ;
}
2008-06-03 18:17:31 +04:00
wc . sec = boot . tv_sec ;
wc . nsec = boot . tv_nsec ;
wc . version = version ;
2008-02-15 22:52:47 +03:00
kvm_write_guest ( kvm , wall_clock , & wc , sizeof ( wc ) ) ;
version + + ;
kvm_write_guest ( kvm , wall_clock , & version , sizeof ( version ) ) ;
}
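kvm_write_wall_clock() above brackets the update with two version writes: an odd version means an update is in progress, an even one means the data is consistent. A reader is expected to retry when it sees an odd version or when the version changes under it. The sketch below models both sides on ordinary memory. `toy_wall_clock`, `wc_write()` and `wc_try_read()` are hypothetical names, and the memory barriers a real guest reader needs are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same three fields as the shared wall clock structure. */
struct toy_wall_clock {
	uint32_t version;
	uint32_t sec;
	uint32_t nsec;
};

/* Writer: make the version odd, update the payload, make it even again. */
static void wc_write(struct toy_wall_clock *wc, uint32_t sec, uint32_t nsec)
{
	uint32_t version;

	assert(wc != NULL);
	version = wc->version;
	if (version & 1)
		++version;          /* first write over uninitialized junk */
	++version;              /* odd: update in progress */
	wc->version = version;

	wc->sec = sec;
	wc->nsec = nsec;

	wc->version = version + 1;  /* even: payload is consistent */
}

/* Reader: returns false if the snapshot may be torn and must be retried. */
static bool wc_try_read(const struct toy_wall_clock *wc,
			uint32_t *sec, uint32_t *nsec)
{
	uint32_t before;

	assert(wc != NULL && sec != NULL && nsec != NULL);
	before = wc->version;
	if (before & 1)
		return false;
	*sec = wc->sec;
	*nsec = wc->nsec;
	return wc->version == before;
}
```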
2008-06-03 18:17:31 +04:00
static uint32_t div_frac ( uint32_t dividend , uint32_t divisor )
{
uint32_t quotient , remainder ;
/* Don't try to replace with do_div(), this one calculates
* " (dividend << 32) / divisor " */
__asm__ ( " divl %4 "
: " =a " ( quotient ) , " =d " ( remainder )
: " 0 " ( 0 ) , " 1 " ( dividend ) , " r " ( divisor ) ) ;
return quotient ;
}
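For readers unfamiliar with the `divl` idiom above: the inline assembly divides the 64-bit value EDX:EAX, with EDX holding the dividend and EAX zero, by the divisor, which amounts to `(dividend << 32) / divisor`. A portable sketch of the same computation follows. The name `div_frac_portable()` is hypothetical, and the assertion documents the precondition that the quotient fits in 32 bits, which `divl` enforces by faulting.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the 0.32 fixed-point fraction dividend / divisor. */
static uint32_t div_frac_portable(uint32_t dividend, uint32_t divisor)
{
	/* The quotient only fits in 32 bits when dividend < divisor. */
	assert(dividend < divisor);
	return (uint32_t)(((uint64_t)dividend << 32) / divisor);
}
```

For example, one half is represented as 0x80000000 and three quarters as 0xC0000000.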
2010-09-19 04:38:13 +04:00
static void kvm_get_time_scale ( uint32_t scaled_khz , uint32_t base_khz ,
s8 * pshift , u32 * pmultiplier )
2008-06-03 18:17:31 +04:00
{
2010-09-19 04:38:13 +04:00
uint64_t scaled64 ;
2008-06-03 18:17:31 +04:00
int32_t shift = 0 ;
uint64_t tps64 ;
uint32_t tps32 ;
2010-09-19 04:38:13 +04:00
tps64 = base_khz * 1000LL ;
scaled64 = scaled_khz * 1000LL ;
2010-09-26 15:00:53 +04:00
while ( tps64 > scaled64 * 2 | | tps64 & 0xffffffff00000000ULL ) {
2008-06-03 18:17:31 +04:00
tps64 > > = 1 ;
shift - - ;
}
tps32 = ( uint32_t ) tps64 ;
2010-09-26 15:00:53 +04:00
while ( tps32 < = scaled64 | | scaled64 & 0xffffffff00000000ULL ) {
if ( scaled64 & 0xffffffff00000000ULL | | tps32 & 0x80000000 )
2010-09-19 04:38:13 +04:00
scaled64 > > = 1 ;
else
tps32 < < = 1 ;
2008-06-03 18:17:31 +04:00
shift + + ;
}
2010-09-19 04:38:13 +04:00
* pshift = shift ;
* pmultiplier = div_frac ( scaled64 , tps32 ) ;
2008-06-03 18:17:31 +04:00
2010-09-19 04:38:13 +04:00
pr_debug ( " %s: base_khz %u => %u, shift %d, mul %u \n " ,
__func__ , base_khz , scaled_khz , shift , * pmultiplier ) ;
2008-06-03 18:17:31 +04:00
}
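The shift/multiplier pair produced by kvm_get_time_scale() above is meant to be applied as "shift the delta, multiply by the 0.32 fixed-point multiplier, drop the low 32 bits". The user-space sketch below ports the function and adds a hypothetical `scale_delta()` helper that applies the result the way pvclock-style scaling is assumed to work here, so the two can be checked against each other. `div_frac()` is replaced by a portable equivalent.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t div_frac(uint32_t dividend, uint32_t divisor)
{
	return (uint32_t)(((uint64_t)dividend << 32) / divisor);
}

/* Port of the scale computation: base_khz units in, scaled_khz units out. */
static void get_time_scale(uint32_t scaled_khz, uint32_t base_khz,
			   int8_t *pshift, uint32_t *pmultiplier)
{
	uint64_t scaled64, tps64;
	uint32_t tps32;
	int32_t shift = 0;

	assert(scaled_khz != 0 && base_khz != 0);
	tps64 = base_khz * 1000ULL;
	scaled64 = scaled_khz * 1000ULL;

	/* Bring the base rate down until it fits and is at most 2x the target. */
	while (tps64 > scaled64 * 2 || (tps64 & 0xffffffff00000000ULL)) {
		tps64 >>= 1;
		shift--;
	}

	tps32 = (uint32_t)tps64;
	/* Normalize so that scaled64 < tps32 and both fit in 32 bits. */
	while (tps32 <= scaled64 || (scaled64 & 0xffffffff00000000ULL)) {
		if ((scaled64 & 0xffffffff00000000ULL) || (tps32 & 0x80000000u))
			scaled64 >>= 1;
		else
			tps32 <<= 1;
		shift++;
	}

	*pshift = (int8_t)shift;
	*pmultiplier = div_frac((uint32_t)scaled64, tps32);
}

/* Hypothetical helper: apply shift, then multiply by mult / 2^32. */
static uint64_t scale_delta(uint64_t delta, uint32_t mult, int8_t shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	/* (delta * mult) >> 32 without needing a 128-bit intermediate */
	return (delta >> 32) * mult + (((delta & 0xffffffffULL) * mult) >> 32);
}
```

With a base of 1,000,000 kHz (nanoseconds) and a target of 2,000,000 kHz (a 2 GHz TSC), the sketch yields shift 2 and multiplier 0x80000000, so 1000 ns scale to 2000 cycles.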
2010-08-20 12:07:25 +04:00
static inline u64 get_kernel_ns ( void )
{
struct timespec ts ;
WARN_ON ( preemptible ( ) ) ;
ktime_get_ts ( & ts ) ;
monotonic_to_bootbased ( & ts ) ;
return timespec_to_ns ( & ts ) ;
2008-06-03 18:17:31 +04:00
}
2009-02-04 19:52:04 +03:00
static DEFINE_PER_CPU ( unsigned long , cpu_tsc_khz ) ;
2010-09-19 04:38:15 +04:00
unsigned long max_tsc_khz ;
2009-02-04 19:52:04 +03:00
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
static inline u64 nsec_to_cycles ( struct kvm_vcpu * vcpu , u64 nsec )
2010-08-20 12:07:21 +04:00
{
2012-02-03 21:43:50 +04:00
return pvclock_scale_delta ( nsec , vcpu - > arch . virtual_tsc_mult ,
vcpu - > arch . virtual_tsc_shift ) ;
2010-08-20 12:07:21 +04:00
}
2012-02-03 21:43:50 +04:00
static u32 adjust_tsc_khz ( u32 khz , s32 ppm )
2011-03-25 11:44:47 +03:00
{
2012-02-03 21:43:50 +04:00
u64 v = ( u64 ) khz * ( 1000000 + ppm ) ;
do_div ( v , 1000000 ) ;
return v ;
2011-03-25 11:44:47 +03:00
}
2012-02-03 21:43:50 +04:00
static void kvm_set_tsc_khz ( struct kvm_vcpu * vcpu , u32 this_tsc_khz )
2010-08-20 12:07:25 +04:00
{
2012-02-03 21:43:50 +04:00
u32 thresh_lo , thresh_hi ;
int use_scaling = 0 ;
2010-08-26 14:38:03 +04:00
2010-09-19 04:38:15 +04:00
/* Compute a scale to convert nanoseconds into TSC cycles */
kvm_get_time_scale ( this_tsc_khz , NSEC_PER_SEC / 1000 ,
2012-02-03 21:43:50 +04:00
& vcpu - > arch . virtual_tsc_shift ,
& vcpu - > arch . virtual_tsc_mult ) ;
vcpu - > arch . virtual_tsc_khz = this_tsc_khz ;
/*
* Compute the variation in TSC rate which is acceptable
* within the range of tolerance and decide if the
* rate being applied is within the bounds of the hardware
* rate . If so , no scaling or compensation need be done .
*/
thresh_lo = adjust_tsc_khz ( tsc_khz , - tsc_tolerance_ppm ) ;
thresh_hi = adjust_tsc_khz ( tsc_khz , tsc_tolerance_ppm ) ;
if ( this_tsc_khz < thresh_lo | | this_tsc_khz > thresh_hi ) {
pr_debug ( " kvm: requested TSC rate %u falls outside tolerance [%u,%u] \n " , this_tsc_khz , thresh_lo , thresh_hi ) ;
use_scaling = 1 ;
}
kvm_x86_ops - > set_tsc_khz ( vcpu , this_tsc_khz , use_scaling ) ;
2010-09-19 04:38:15 +04:00
}
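The tolerance check in kvm_set_tsc_khz() above reduces to simple parts-per-million arithmetic: widen the host rate by plus and minus the tolerance and test whether the requested rate falls outside that window. A standalone sketch, with hypothetical names `adjust_khz()` and `needs_scaling()` and the host rate and tolerance passed as parameters instead of being read from globals:

```c
#include <assert.h>
#include <stdint.h>

/* khz * (1 + ppm / 1e6), computed in 64 bits to avoid overflow. */
static uint32_t adjust_khz(uint32_t khz, int32_t ppm)
{
	uint64_t v;

	assert(ppm > -1000000);
	v = (uint64_t)khz * (uint64_t)(1000000 + ppm);
	return (uint32_t)(v / 1000000);
}

/* Nonzero if requested_khz lies outside host_khz +/- tolerance_ppm. */
static int needs_scaling(uint32_t host_khz, uint32_t requested_khz,
			 uint32_t tolerance_ppm)
{
	uint32_t lo = adjust_khz(host_khz, -(int32_t)tolerance_ppm);
	uint32_t hi = adjust_khz(host_khz, (int32_t)tolerance_ppm);

	return requested_khz < lo || requested_khz > hi;
}
```

For a 2,000,000 kHz host and a 250 ppm tolerance, the window is [1999500, 2000500] kHz, so a request for 2,000,400 kHz needs no scaling while 2,001,000 kHz does.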
static u64 compute_guest_tsc ( struct kvm_vcpu * vcpu , s64 kernel_ns )
{
2012-02-03 21:43:57 +04:00
u64 tsc = pvclock_scale_delta ( kernel_ns - vcpu - > arch . this_tsc_nsec ,
2012-02-03 21:43:50 +04:00
vcpu - > arch . virtual_tsc_mult ,
vcpu - > arch . virtual_tsc_shift ) ;
2012-02-03 21:43:57 +04:00
tsc + = vcpu - > arch . this_tsc_write ;
2010-09-19 04:38:15 +04:00
return tsc ;
}
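compute_guest_tsc() above extrapolates the guest TSC from a recorded base point: the TSC value last written, plus the nanoseconds elapsed since that write converted to cycles. A self-contained sketch of the same extrapolation, where `tsc_base`, `guest_tsc_at()` and `scale_delta()` are hypothetical names and the scaling helper is an assumed equivalent of the pvclock-style delta scaling:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of the values the real code keeps per vcpu. */
struct tsc_base {
	uint64_t tsc_write;  /* guest TSC value at the base point */
	uint64_t nsec;       /* kernel time (ns) at the base point */
	uint32_t mult;       /* 0.32 fixed-point multiplier */
	int8_t shift;        /* pre-shift applied to the delta */
};

/* Shift the delta, then multiply by mult / 2^32. */
static uint64_t scale_delta(uint64_t delta, uint32_t mult, int8_t shift)
{
	if (shift < 0)
		delta >>= -shift;
	else
		delta <<= shift;
	return (delta >> 32) * mult + (((delta & 0xffffffffULL) * mult) >> 32);
}

/* Guest TSC expected at kernel time kernel_ns. */
static uint64_t guest_tsc_at(const struct tsc_base *b, uint64_t kernel_ns)
{
	assert(b != NULL);
	return b->tsc_write + scale_delta(kernel_ns - b->nsec, b->mult, b->shift);
}
```

With shift 2 and multiplier 0x80000000 (two cycles per nanosecond), 1000 ns after a base point at TSC 1,000,000 the expected guest TSC is 1,002,000.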
2010-08-20 12:07:17 +04:00
void kvm_write_tsc ( struct kvm_vcpu * vcpu , u64 data )
{
struct kvm * kvm = vcpu - > kvm ;
2010-08-20 12:07:20 +04:00
u64 offset , ns , elapsed ;
2010-08-20 12:07:17 +04:00
unsigned long flags ;
2012-03-09 01:46:57 +04:00
s64 usdiff ;
2010-08-20 12:07:17 +04:00
2011-02-04 12:49:11 +03:00
raw_spin_lock_irqsave ( & kvm - > arch . tsc_write_lock , flags ) ;
2011-03-25 11:44:50 +03:00
offset = kvm_x86_ops - > compute_tsc_offset ( vcpu , data ) ;
2010-08-20 12:07:25 +04:00
ns = get_kernel_ns ( ) ;
2010-08-20 12:07:20 +04:00
elapsed = ns - kvm - > arch . last_tsc_nsec ;
KVM: Improve TSC offset matching
There are a few improvements that can be made to the TSC offset
matching code. First, we don't need to call the 128-bit multiply
(especially on a constant number), the code works much nicer to
do computation in nanosecond units.
Second, the way everything is set up with software TSC rate scaling,
we currently have per-cpu rates. Obviously this isn't too desirable
to use in practice, but if for some reason we do change the rate of
all VCPUs at runtime, then reset the TSCs, we will only want to
match offsets for VCPUs running at the same rate.
Finally, for the case where we have an unstable host TSC, but
rate scaling is being done in hardware, we should call the platform
code to compute the TSC offset, so the math is reorganized to recompute
the base instead, then transform the base into an offset using the
existing API.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
KVM: Fix 64-bit division in kvm_write_tsc()
Breaks i386 build.
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:51 +04:00
/* n.b - signed multiplication and division required */
2012-03-09 01:46:57 +04:00
usdiff = data - kvm - > arch . last_tsc_write ;
2012-02-03 21:43:51 +04:00
# ifdef CONFIG_X86_64
2012-03-09 01:46:57 +04:00
usdiff = ( usdiff * 1000 ) / vcpu - > arch . virtual_tsc_khz ;
2012-02-03 21:43:51 +04:00
# else
/* do_div() only does unsigned */
asm ( " idivl %2; xor %%edx, %%edx "
2012-03-09 01:46:57 +04:00
: " =A " ( usdiff )
: " A " ( usdiff * 1000 ) , " rm " ( vcpu - > arch . virtual_tsc_khz ) ) ;
2012-02-03 21:43:51 +04:00
# endif
2012-03-09 01:46:57 +04:00
do_div ( elapsed , 1000 ) ;
usdiff - = elapsed ;
if ( usdiff < 0 )
usdiff = - usdiff ;
2010-08-20 12:07:20 +04:00
/*
2012-02-03 21:43:51 +04:00
* Special case : TSC write with a small delta ( 1 second ) of virtual
* cycle time against real time is interpreted as an attempt to
* synchronize the CPU .
*
* For a reliable TSC , we can match TSC offsets , and for an unstable
* TSC , we add elapsed time in this computation . We could let the
* compensation code attempt to catch up if we fall behind , but
* it ' s better to try to match offsets from the beginning .
*/
2012-03-09 01:46:57 +04:00
if ( usdiff < USEC_PER_SEC & &
2012-02-03 21:43:51 +04:00
vcpu - > arch . virtual_tsc_khz = = kvm - > arch . last_tsc_khz ) {
2010-08-20 12:07:20 +04:00
if ( ! check_tsc_unstable ( ) ) {
2012-02-03 21:43:57 +04:00
offset = kvm - > arch . cur_tsc_offset ;
2010-08-20 12:07:20 +04:00
pr_debug ( " kvm: matched tsc offset for %llu \n " , data ) ;
} else {
2011-03-25 11:44:50 +03:00
u64 delta = nsec_to_cycles ( vcpu , elapsed ) ;
2012-02-03 21:43:51 +04:00
data + = delta ;
offset = kvm_x86_ops - > compute_tsc_offset ( vcpu , data ) ;
2010-08-20 12:07:25 +04:00
pr_debug ( " kvm: adjusted tsc offset by %llu \n " , delta ) ;
2010-08-20 12:07:20 +04:00
}
2012-02-03 21:43:57 +04:00
} else {
/*
* We split periods of matched TSC writes into generations .
* For each generation , we track the original measured
* nanosecond time , offset , and write , so if TSCs are in
* sync , we can match exact offset , and if not , we can match
* exact software computation in compute_guest_tsc ( )
*
* These values are tracked in kvm - > arch . cur_xxx variables .
*/
kvm - > arch . cur_tsc_generation + + ;
kvm - > arch . cur_tsc_nsec = ns ;
kvm - > arch . cur_tsc_write = data ;
kvm - > arch . cur_tsc_offset = offset ;
pr_debug ( " kvm: new tsc generation %u, clock %llu \n " ,
kvm - > arch . cur_tsc_generation , data ) ;
2010-08-20 12:07:20 +04:00
}
2012-02-03 21:43:57 +04:00
/*
* We also track the most recent recorded KHZ , write and time to
* allow the matching interval to be extended at each write .
*/
2010-08-20 12:07:20 +04:00
kvm - > arch . last_tsc_nsec = ns ;
kvm - > arch . last_tsc_write = data ;
2012-02-03 21:43:51 +04:00
kvm - > arch . last_tsc_khz = vcpu - > arch . virtual_tsc_khz ;
2010-08-20 12:07:17 +04:00
/* Reset of TSC must disable overshoot protection below */
vcpu - > arch . hv_clock . tsc_timestamp = 0 ;
KVM: Fix last_guest_tsc / tsc_offset semantics
The variable last_guest_tsc was being used as an ad-hoc indicator
that guest TSC has been initialized and recorded correctly. However,
it may not have been, it could be that guest TSC has been set to some
large value, then back to a small value (by, say, a software reboot).
This defeats the logic and causes KVM to falsely assume that the
guest TSC has gone backwards, marking the host TSC unstable, which
is undesirable behavior.
In addition, rather than try to compute an offset adjustment for the
TSC on unstable platforms, just recompute the whole offset. This
allows us to get rid of one callsite for adjust_tsc_offset, which
is problematic because the units it takes are in guest units, but
here, the computation was originally being done in host units.
Doing this, and also recording last_guest_tsc when the TSC is written
allow us to remove the tricky logic which depended on last_guest_tsc
being zero to indicate a reset or uninitialized value.
Instead, we now have the guarantee that the guest TSC offset is
always at least something which will get us last_guest_tsc.
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:53 +04:00
vcpu - > arch . last_guest_tsc = data ;
2012-02-03 21:43:57 +04:00
/* Keep track of which generation this VCPU has synchronized to */
vcpu - > arch . this_tsc_generation = kvm - > arch . cur_tsc_generation ;
vcpu - > arch . this_tsc_nsec = kvm - > arch . cur_tsc_nsec ;
vcpu - > arch . this_tsc_write = kvm - > arch . cur_tsc_write ;
kvm_x86_ops - > write_tsc_offset ( vcpu , offset ) ;
raw_spin_unlock_irqrestore ( & kvm - > arch . tsc_write_lock , flags ) ;
2010-08-20 12:07:17 +04:00
}
2012-02-03 21:43:57 +04:00
2010-08-20 12:07:17 +04:00
EXPORT_SYMBOL_GPL ( kvm_write_tsc ) ;
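The matching rule used by kvm_write_tsc() can be sketched as a standalone userspace function. This is a minimal sketch under stated assumptions, not the kernel code: the helper name tsc_write_matches and its parameters are hypothetical, and the starting delta (written value minus the previously written value) is assumed to be what the function computes before the division shown above. The guard against a zero rate is an addition for the sketch only.

```c
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000LL

/*
 * Hypothetical helper: decide whether a guest TSC write looks like an
 * attempt to synchronize with the previous write.
 *
 * tsc_khz is cycles per millisecond, so cycles * 1000 / khz yields
 * microseconds of virtual cycle time.
 */
static bool tsc_write_matches(uint64_t data, uint64_t last_write,
                              int64_t elapsed_ns, uint32_t tsc_khz,
                              uint32_t last_tsc_khz)
{
    int64_t usdiff;

    if (tsc_khz == 0)               /* sketch-only guard, avoids dividing by 0 */
        return false;

    usdiff = (int64_t)(data - last_write);        /* assumed starting delta */
    usdiff = (usdiff * 1000) / (int64_t)tsc_khz;  /* cycles -> microseconds */
    usdiff -= elapsed_ns / 1000;                  /* subtract real elapsed us */
    if (usdiff < 0)
        usdiff = -usdiff;

    /* Match only within one second and only at the same virtual rate. */
    return usdiff < USEC_PER_SEC && tsc_khz == last_tsc_khz;
}
```

With a 1 GHz virtual rate, a write that is 0.5 s of cycles ahead while 0.4 s of real time has elapsed differs by 0.1 s and would be treated as a match; a write 5 s ahead with no elapsed time would start a new generation.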
2010-09-19 04:38:14 +04:00
static int kvm_guest_time_update ( struct kvm_vcpu * v )
2008-02-15 22:52:47 +03:00
{
unsigned long flags ;
struct kvm_vcpu_arch * vcpu = & v - > arch ;
void * shared_kaddr ;
2009-04-12 16:49:07 +04:00
unsigned long this_tsc_khz ;
2010-08-20 12:07:30 +04:00
s64 kernel_ns , max_kernel_ns ;
u64 tsc_timestamp ;
2008-02-15 22:52:47 +03:00
/* Keep irq disabled to prevent changes to the clock */
local_irq_save ( flags ) ;
2011-08-02 16:54:20 +04:00
tsc_timestamp = kvm_x86_ops - > read_l1_tsc ( v ) ;
2010-08-20 12:07:25 +04:00
kernel_ns = get_kernel_ns ( ) ;
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
this_tsc_khz = __get_cpu_var ( cpu_tsc_khz ) ;
2010-08-20 12:07:21 +04:00
if ( unlikely ( this_tsc_khz = = 0 ) ) {
2010-09-19 04:38:15 +04:00
local_irq_restore ( flags ) ;
2010-09-19 04:38:14 +04:00
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , v ) ;
2010-08-20 12:07:21 +04:00
return 1 ;
}
2008-02-15 22:52:47 +03:00
2010-09-19 04:38:15 +04:00
/*
* We may have to catch up the TSC to match elapsed wall clock
* time for two reasons , even if kvmclock is used .
* 1 ) CPU could have been running below the maximum TSC rate
* 2 ) Broken TSC compensation resets the base at each VCPU
* entry to avoid unknown leaps of TSC even when running
* again on the same CPU . This may cause apparent elapsed
* time to disappear , and the guest to stand still or run
* very slowly .
*/
if ( vcpu - > tsc_catchup ) {
u64 tsc = compute_guest_tsc ( v , kernel_ns ) ;
if ( tsc > tsc_timestamp ) {
2012-02-03 21:43:55 +04:00
adjust_tsc_offset_guest ( v , tsc - tsc_timestamp ) ;
2010-09-19 04:38:15 +04:00
tsc_timestamp = tsc ;
}
2008-06-03 18:17:31 +04:00
}
2008-02-15 22:52:47 +03:00
local_irq_restore ( flags ) ;
2010-09-19 04:38:15 +04:00
if ( ! vcpu - > time_page )
return 0 ;
2008-02-15 22:52:47 +03:00
2010-08-20 12:07:30 +04:00
/*
* Time as measured by the TSC may go backwards when resetting the base
* tsc_timestamp . The reason for this is that the TSC resolution is
* higher than the resolution of the other clock scales . Thus , many
* possible measurements of the TSC correspond to one measurement of any
* other clock , and so a spread of values is possible . This is not a
* problem for the computation of the nanosecond clock ; with TSC rates
* around 1 GHz , there can only be a few cycles which correspond to one
* nanosecond value , and any path through this code will inevitably
* take longer than that . However , with the kernel_ns value itself ,
* the precision may be much lower , down to HZ granularity . If the
* first sampling of TSC against kernel_ns ends in the low part of the
* range , and the second in the high end of the range , we can get :
*
* ( TSC - offset_low ) * S + kns_old > ( TSC - offset_high ) * S + kns_new
*
* As the sampling errors potentially range in the thousands of cycles ,
* it is possible such a time value has already been observed by the
* guest . To protect against this , we must compute the system time as
* observed by the guest and ensure the new system time is greater .
*/
max_kernel_ns = 0 ;
2012-02-03 21:43:53 +04:00
if ( vcpu - > hv_clock . tsc_timestamp ) {
2010-08-20 12:07:30 +04:00
max_kernel_ns = vcpu - > last_guest_tsc -
vcpu - > hv_clock . tsc_timestamp ;
max_kernel_ns = pvclock_scale_delta ( max_kernel_ns ,
vcpu - > hv_clock . tsc_to_system_mul ,
vcpu - > hv_clock . tsc_shift ) ;
max_kernel_ns + = vcpu - > last_kernel_ns ;
}
2009-10-16 23:28:36 +04:00
2010-08-20 12:07:23 +04:00
if ( unlikely ( vcpu - > hw_tsc_khz ! = this_tsc_khz ) ) {
2010-09-19 04:38:13 +04:00
kvm_get_time_scale ( NSEC_PER_SEC / 1000 , this_tsc_khz ,
& vcpu - > hv_clock . tsc_shift ,
& vcpu - > hv_clock . tsc_to_system_mul ) ;
2010-08-20 12:07:23 +04:00
vcpu - > hw_tsc_khz = this_tsc_khz ;
2010-08-20 12:07:21 +04:00
}
2010-08-20 12:07:30 +04:00
if ( max_kernel_ns > kernel_ns )
kernel_ns = max_kernel_ns ;
2010-08-20 12:07:21 +04:00
/* With all the info we got, fill in the values */
2010-08-20 12:07:30 +04:00
vcpu - > hv_clock . tsc_timestamp = tsc_timestamp ;
2010-08-20 12:07:25 +04:00
vcpu - > hv_clock . system_time = kernel_ns + v - > kvm - > arch . kvmclock_offset ;
2010-08-20 12:07:30 +04:00
vcpu - > last_kernel_ns = kernel_ns ;
2010-09-19 04:38:12 +04:00
vcpu - > last_guest_tsc = tsc_timestamp ;
2010-05-11 20:17:46 +04:00
vcpu - > hv_clock . flags = 0 ;
2008-02-15 22:52:47 +03:00
/*
* The interface expects us to write an even number signaling that the
* update is finished . Since the guest won ' t see the intermediate
2008-06-03 18:17:31 +04:00
* state , we just increase by 2 at the end .
2008-02-15 22:52:47 +03:00
*/
2008-06-03 18:17:31 +04:00
vcpu - > hv_clock . version + = 2 ;
2008-02-15 22:52:47 +03:00
2011-11-25 19:14:17 +04:00
shared_kaddr = kmap_atomic ( vcpu - > time_page ) ;
2008-02-15 22:52:47 +03:00
memcpy ( shared_kaddr + vcpu - > time_offset , & vcpu - > hv_clock ,
2008-06-03 18:17:31 +04:00
sizeof ( vcpu - > hv_clock ) ) ;
2008-02-15 22:52:47 +03:00
2011-11-25 19:14:17 +04:00
kunmap_atomic ( shared_kaddr ) ;
2008-02-15 22:52:47 +03:00
mark_page_dirty ( v - > kvm , vcpu - > time > > PAGE_SHIFT ) ;
2010-08-20 12:07:21 +04:00
return 0 ;
2009-02-04 19:52:04 +03:00
}
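The backwards-time guard in kvm_guest_time_update() can be illustrated in isolation. The sketch below is an assumption-laden userspace rendering: scale_delta is a hypothetical stand-in for pvclock_scale_delta() (shift, then multiply by a 32-bit fraction and drop the low 32 bits), and clamp_kernel_ns is a hypothetical name for the max_kernel_ns comparison.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for pvclock_scale_delta(): apply the shift, then
 * multiply by a 0.32 fixed-point fraction. The 64x32 product is split so
 * that no 128-bit type is needed.
 */
static uint64_t scale_delta(uint64_t delta, uint32_t mul_frac, int shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    return (delta >> 32) * mul_frac +
           (((delta & 0xffffffffULL) * mul_frac) >> 32);
}

/*
 * Never report a system time lower than what the guest could already have
 * computed from the previous (base_tsc, last_kernel_ns) pair.
 * base_tsc == 0 means "no previous sample", which disables the guard.
 */
static int64_t clamp_kernel_ns(int64_t kernel_ns, uint64_t last_guest_tsc,
                               uint64_t base_tsc, int64_t last_kernel_ns,
                               uint32_t mul_frac, int shift)
{
    int64_t max_kernel_ns = 0;

    if (base_tsc) {
        max_kernel_ns = (int64_t)scale_delta(last_guest_tsc - base_tsc,
                                             mul_frac, shift);
        max_kernel_ns += last_kernel_ns;
    }
    return max_kernel_ns > kernel_ns ? max_kernel_ns : kernel_ns;
}
```

A multiplier of 0x80000000 with a shift of 1 corresponds to one nanosecond per cycle, which makes the arithmetic easy to check by hand.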
2008-05-26 21:06:35 +04:00
static bool msr_mtrr_valid ( unsigned msr )
{
switch ( msr ) {
case 0x200 . . . 0x200 + 2 * KVM_NR_VAR_MTRR - 1 :
case MSR_MTRRfix64K_00000 :
case MSR_MTRRfix16K_80000 :
case MSR_MTRRfix16K_A0000 :
case MSR_MTRRfix4K_C0000 :
case MSR_MTRRfix4K_C8000 :
case MSR_MTRRfix4K_D0000 :
case MSR_MTRRfix4K_D8000 :
case MSR_MTRRfix4K_E0000 :
case MSR_MTRRfix4K_E8000 :
case MSR_MTRRfix4K_F0000 :
case MSR_MTRRfix4K_F8000 :
case MSR_MTRRdefType :
case MSR_IA32_CR_PAT :
return true ;
case 0x2f8 :
return true ;
}
return false ;
}
2009-06-22 22:27:56 +04:00
static bool valid_pat_type ( unsigned t )
{
return t < 8 & & ( 1 < < t ) & 0xf3 ; /* 0, 1, 4, 5, 6, 7 */
}
static bool valid_mtrr_type ( unsigned t )
{
return t < 8 & & ( 1 < < t ) & 0x73 ; /* 0, 1, 4, 5, 6 */
}
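The two validators above encode the set of legal memory types as a bitmask: bit t of the mask is set when type t is allowed. A small standalone sketch of the same idea follows; type_in_mask and pat_value_valid are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when t is below 8 and bit t is set in mask. */
static bool type_in_mask(unsigned t, unsigned mask)
{
    return t < 8 && ((1u << t) & mask) != 0;
}

/*
 * A PAT value holds eight one-byte entries; each must be a legal PAT type.
 * Mask 0xf3 allows types 0, 1, 4, 5, 6, 7 (types 2 and 3 are reserved).
 */
static bool pat_value_valid(uint64_t data)
{
    for (int i = 0; i < 8; i++)
        if (!type_in_mask((unsigned)((data >> (i * 8)) & 0xff), 0xf3))
            return false;
    return true;
}
```

The MTRR mask 0x73 differs from the PAT mask 0xf3 only in bit 7, so type 7 passes the PAT check but not the MTRR check.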
static bool mtrr_valid ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
{
int i ;
if ( ! msr_mtrr_valid ( msr ) )
return false ;
if ( msr = = MSR_IA32_CR_PAT ) {
for ( i = 0 ; i < 8 ; i + + )
if ( ! valid_pat_type ( ( data > > ( i * 8 ) ) & 0xff ) )
return false ;
return true ;
} else if ( msr = = MSR_MTRRdefType ) {
if ( data & ~ 0xcff )
return false ;
return valid_mtrr_type ( data & 0xff ) ;
} else if ( msr > = MSR_MTRRfix64K_00000 & & msr < = MSR_MTRRfix4K_F8000 ) {
for ( i = 0 ; i < 8 ; i + + )
if ( ! valid_mtrr_type ( ( data > > ( i * 8 ) ) & 0xff ) )
return false ;
return true ;
}
/* variable MTRRs */
return valid_mtrr_type ( data & 0xff ) ;
}
2008-05-26 21:06:35 +04:00
static int set_msr_mtrr ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
{
2008-10-09 12:01:54 +04:00
u64 * p = ( u64 * ) & vcpu - > arch . mtrr_state . fixed_ranges ;
2009-06-22 22:27:56 +04:00
if ( ! mtrr_valid ( vcpu , msr , data ) )
2008-05-26 21:06:35 +04:00
return 1 ;
2008-10-09 12:01:54 +04:00
if ( msr = = MSR_MTRRdefType ) {
vcpu - > arch . mtrr_state . def_type = data ;
vcpu - > arch . mtrr_state . enabled = ( data & 0xc00 ) > > 10 ;
} else if ( msr = = MSR_MTRRfix64K_00000 )
p [ 0 ] = data ;
else if ( msr = = MSR_MTRRfix16K_80000 | | msr = = MSR_MTRRfix16K_A0000 )
p [ 1 + msr - MSR_MTRRfix16K_80000 ] = data ;
else if ( msr > = MSR_MTRRfix4K_C0000 & & msr < = MSR_MTRRfix4K_F8000 )
p [ 3 + msr - MSR_MTRRfix4K_C0000 ] = data ;
else if ( msr = = MSR_IA32_CR_PAT )
vcpu - > arch . pat = data ;
else { /* Variable MTRRs */
int idx , is_mtrr_mask ;
u64 * pt ;
idx = ( msr - 0x200 ) / 2 ;
is_mtrr_mask = msr - 0x200 - 2 * idx ;
if ( ! is_mtrr_mask )
pt =
( u64 * ) & vcpu - > arch . mtrr_state . var_ranges [ idx ] . base_lo ;
else
pt =
( u64 * ) & vcpu - > arch . mtrr_state . var_ranges [ idx ] . mask_lo ;
* pt = data ;
}
kvm_mmu_reset_context ( vcpu ) ;
2008-05-26 21:06:35 +04:00
return 0 ;
}
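The variable-range branch of set_msr_mtrr() relies on the MSRs from 0x200 upward alternating between base and mask registers. A minimal sketch of that decode, with a hypothetical struct and function name:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for the sketch. */
struct var_mtrr_slot {
    unsigned idx;   /* which variable range */
    bool is_mask;   /* false: base register, true: mask register */
};

/* Even offsets from 0x200 are base registers, odd offsets are masks. */
static struct var_mtrr_slot decode_var_mtrr(uint32_t msr)
{
    struct var_mtrr_slot s;
    uint32_t rel = msr - 0x200;

    s.idx = rel / 2;
    s.is_mask = (rel & 1) != 0;
    return s;
}
```

For instance, MSR 0x20f decodes to the mask register of range 7.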
2007-10-30 20:44:17 +03:00
2009-05-11 12:48:15 +04:00
static int set_msr_mce ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
2007-10-30 20:44:17 +03:00
{
2009-05-11 12:48:15 +04:00
u64 mcg_cap = vcpu - > arch . mcg_cap ;
unsigned bank_num = mcg_cap & 0xff ;
2007-10-30 20:44:17 +03:00
switch ( msr ) {
case MSR_IA32_MCG_STATUS :
2009-05-11 12:48:15 +04:00
vcpu - > arch . mcg_status = data ;
2007-10-30 20:44:17 +03:00
break ;
2008-02-11 22:28:27 +03:00
case MSR_IA32_MCG_CTL :
2009-05-11 12:48:15 +04:00
if ( ! ( mcg_cap & MCG_CTL_P ) )
return 1 ;
if ( data ! = 0 & & data ! = ~ ( u64 ) 0 )
return - 1 ;
vcpu - > arch . mcg_ctl = data ;
break ;
default :
if ( msr > = MSR_IA32_MC0_CTL & &
msr < MSR_IA32_MC0_CTL + 4 * bank_num ) {
u32 offset = msr - MSR_IA32_MC0_CTL ;
2010-03-24 19:46:42 +03:00
/* Only 0 or all 1s can be written to IA32_MCi_CTL .
* Some Linux kernels , however , clear bit 10 in bank 4 to
* work around a BIOS / GART TBL issue on AMD K8s ; ignore
* this to avoid an uncaught # GP in the guest .
*/
2009-05-11 12:48:15 +04:00
if ( ( offset & 0x3 ) = = 0 & &
2010-03-24 19:46:42 +03:00
data ! = 0 & & ( data | ( 1 < < 10 ) ) ! = ~ ( u64 ) 0 )
2009-05-11 12:48:15 +04:00
return - 1 ;
vcpu - > arch . mce_banks [ offset ] = data ;
break ;
}
return 1 ;
}
return 0 ;
}
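The default branch of set_msr_mce() treats the bank MSRs as groups of four registers per bank and applies a relaxed "0 or all 1s" rule to the control register. The sketch below restates both pieces in userspace C; the names are hypothetical, and the base value 0x400 for MSR_IA32_MC0_CTL is stated as an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MC0_CTL_BASE 0x400u  /* assumed value of MSR_IA32_MC0_CTL */

/* Hypothetical result type: bank number and register within the bank. */
struct mce_reg {
    unsigned bank;
    unsigned reg;   /* 0: CTL, 1: STATUS, 2: ADDR, 3: MISC */
};

static struct mce_reg decode_mce_msr(uint32_t msr)
{
    struct mce_reg r;
    uint32_t offset = msr - MC0_CTL_BASE;

    r.bank = offset / 4;
    r.reg = offset & 0x3;
    return r;
}

/*
 * A CTL write is accepted when it is 0 or all 1s; bit 10 is ignored in the
 * all-1s comparison so the AMD K8 workaround does not fault.
 */
static bool mci_ctl_write_ok(uint64_t data)
{
    return data == 0 || (data | (1ULL << 10)) == ~(uint64_t)0;
}
```

Note that the kernel code indexes mce_banks directly by the raw offset; the bank/register split here is only to make the layout visible.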
2009-10-16 02:21:43 +04:00
static int xen_hvm_config ( struct kvm_vcpu * vcpu , u64 data )
{
struct kvm * kvm = vcpu - > kvm ;
int lm = is_long_mode ( vcpu ) ;
u8 * blob_addr = lm ? ( u8 * ) ( long ) kvm - > arch . xen_hvm_config . blob_addr_64
: ( u8 * ) ( long ) kvm - > arch . xen_hvm_config . blob_addr_32 ;
u8 blob_size = lm ? kvm - > arch . xen_hvm_config . blob_size_64
: kvm - > arch . xen_hvm_config . blob_size_32 ;
u32 page_num = data & ~ PAGE_MASK ;
u64 page_addr = data & PAGE_MASK ;
u8 * page ;
int r ;
r = - E2BIG ;
if ( page_num > = blob_size )
goto out ;
r = - ENOMEM ;
2011-12-04 21:36:29 +04:00
page = memdup_user ( blob_addr + ( page_num * PAGE_SIZE ) , PAGE_SIZE ) ;
if ( IS_ERR ( page ) ) {
r = PTR_ERR ( page ) ;
2009-10-16 02:21:43 +04:00
goto out ;
2011-12-04 21:36:29 +04:00
}
2009-10-16 02:21:43 +04:00
if ( kvm_write_guest ( kvm , page_addr , page , PAGE_SIZE ) )
goto out_free ;
r = 0 ;
out_free :
kfree ( page ) ;
out :
return r ;
}
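xen_hvm_config() packs two values into one MSR write: the low bits select a page of the hypercall blob and the page-aligned remainder is the guest address to copy it to. A minimal sketch of that split, assuming 4 KiB pages; the SKETCH_ macros and split_xen_msr are hypothetical names.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL                    /* assumed page size */
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Low 12 bits: blob page index. Upper bits: destination guest address. */
static void split_xen_msr(uint64_t data, uint32_t *page_num,
                          uint64_t *page_addr)
{
    *page_num = (uint32_t)(data & ~SKETCH_PAGE_MASK);
    *page_addr = data & SKETCH_PAGE_MASK;
}
```

A write of 0x12345003 would therefore request blob page 3 at guest address 0x12345000.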
2010-01-17 16:51:22 +03:00
static bool kvm_hv_hypercall_enabled ( struct kvm * kvm )
{
return kvm - > arch . hv_hypercall & HV_X64_MSR_HYPERCALL_ENABLE ;
}
static bool kvm_hv_msr_partition_wide ( u32 msr )
{
bool r = false ;
switch ( msr ) {
case HV_X64_MSR_GUEST_OS_ID :
case HV_X64_MSR_HYPERCALL :
r = true ;
break ;
}
return r ;
}
static int set_msr_hyperv_pw ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
{
struct kvm * kvm = vcpu - > kvm ;
switch ( msr ) {
case HV_X64_MSR_GUEST_OS_ID :
kvm - > arch . hv_guest_os_id = data ;
/* setting guest os id to zero disables hypercall page */
if ( ! kvm - > arch . hv_guest_os_id )
kvm - > arch . hv_hypercall & = ~ HV_X64_MSR_HYPERCALL_ENABLE ;
break ;
case HV_X64_MSR_HYPERCALL : {
u64 gfn ;
unsigned long addr ;
u8 instructions [ 4 ] ;
/* if guest os id is not set hypercall should remain disabled */
if ( ! kvm - > arch . hv_guest_os_id )
break ;
if ( ! ( data & HV_X64_MSR_HYPERCALL_ENABLE ) ) {
kvm - > arch . hv_hypercall = data ;
break ;
}
gfn = data > > HV_X64_MSR_HYPERCALL_PAGE_ADDRESS_SHIFT ;
addr = gfn_to_hva ( kvm , gfn ) ;
if ( kvm_is_error_hva ( addr ) )
return 1 ;
kvm_x86_ops - > patch_hypercall ( vcpu , instructions ) ;
( ( unsigned char * ) instructions ) [ 3 ] = 0xc3 ; /* ret */
2011-05-15 19:22:04 +04:00
if ( __copy_to_user ( ( void __user * ) addr , instructions , 4 ) )
2010-01-17 16:51:22 +03:00
return 1 ;
kvm - > arch . hv_hypercall = data ;
break ;
}
default :
KVM: Cleanup the kvm_print functions and introduce pr_XX wrappers
Introduces a couple of print functions, which are essentially wrappers
around standard printk functions, with a KVM: prefix.
Functions introduced or modified are:
- kvm_err(fmt, ...)
- kvm_info(fmt, ...)
- kvm_debug(fmt, ...)
- kvm_pr_unimpl(fmt, ...)
- pr_unimpl(vcpu, fmt, ...) -> vcpu_unimpl(vcpu, fmt, ...)
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " HYPER-V unimplemented wrmsr: 0x%x "
" data 0x%llx \n " , msr , data ) ;
2010-01-17 16:51:22 +03:00
return 1 ;
}
return 0 ;
}
static int set_msr_hyperv ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
{
2010-01-17 16:51:23 +03:00
switch ( msr ) {
case HV_X64_MSR_APIC_ASSIST_PAGE : {
unsigned long addr ;
2010-01-17 16:51:22 +03:00
2010-01-17 16:51:23 +03:00
if ( ! ( data & HV_X64_MSR_APIC_ASSIST_PAGE_ENABLE ) ) {
vcpu - > arch . hv_vapic = data ;
break ;
}
addr = gfn_to_hva ( vcpu - > kvm , data > >
HV_X64_MSR_APIC_ASSIST_PAGE_ADDRESS_SHIFT ) ;
if ( kvm_is_error_hva ( addr ) )
return 1 ;
2011-05-15 19:22:04 +04:00
if ( __clear_user ( ( void __user * ) addr , PAGE_SIZE ) )
2010-01-17 16:51:23 +03:00
return 1 ;
vcpu - > arch . hv_vapic = data ;
break ;
}
case HV_X64_MSR_EOI :
return kvm_hv_vapic_msr_write ( vcpu , APIC_EOI , data ) ;
case HV_X64_MSR_ICR :
return kvm_hv_vapic_msr_write ( vcpu , APIC_ICR , data ) ;
case HV_X64_MSR_TPR :
return kvm_hv_vapic_msr_write ( vcpu , APIC_TASKPRI , data ) ;
default :
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " HYPER-V unimplemented wrmsr: 0x%x "
" data 0x%llx \n " , msr , data ) ;
2010-01-17 16:51:23 +03:00
return 1 ;
}
return 0 ;
2010-01-17 16:51:22 +03:00
}
2010-10-14 13:22:50 +04:00
static int kvm_pv_enable_async_pf ( struct kvm_vcpu * vcpu , u64 data )
{
gpa_t gpa = data & ~ 0x3f ;
2010-10-14 13:22:55 +04:00
/* Bits 2:5 are reserved and should be zero */
if ( data & 0x3c )
2010-10-14 13:22:50 +04:00
return 1 ;
vcpu - > arch . apf . msr_val = data ;
if ( ! ( data & KVM_ASYNC_PF_ENABLED ) ) {
kvm_clear_async_pf_completion_queue ( vcpu ) ;
kvm_async_pf_hash_reset ( vcpu ) ;
return 0 ;
}
if ( kvm_gfn_to_hva_cache_init ( vcpu - > kvm , & vcpu - > arch . apf . data , gpa ) )
return 1 ;
2010-10-14 13:22:55 +04:00
vcpu - > arch . apf . send_user_only = ! ( data & KVM_ASYNC_PF_SEND_ALWAYS ) ;
2010-10-14 13:22:50 +04:00
kvm_async_pf_wakeup_all ( vcpu ) ;
return 0 ;
}
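The bit layout that kvm_pv_enable_async_pf() checks can be made explicit with a small decoder. This is a sketch under assumptions: the struct and function names are hypothetical, and the flag values (bit 0 for enable, bit 1 for send-always) mirror KVM_ASYNC_PF_ENABLED and KVM_ASYNC_PF_SEND_ALWAYS as the sketch understands them.

```c
#include <stdbool.h>
#include <stdint.h>

#define ASYNC_PF_ENABLED     (1ULL << 0)  /* assumed flag value */
#define ASYNC_PF_SEND_ALWAYS (1ULL << 1)  /* assumed flag value */

/* Hypothetical decoded view of the MSR value. */
struct apf_cfg {
    uint64_t gpa;         /* 64-byte aligned guest physical address */
    bool enabled;
    bool send_user_only;  /* true unless SEND_ALWAYS is set */
};

/* Returns false when reserved bits 2:5 are set, as the handler does. */
static bool decode_async_pf(uint64_t data, struct apf_cfg *out)
{
    if (data & 0x3c)
        return false;

    out->gpa = data & ~0x3fULL;
    out->enabled = (data & ASYNC_PF_ENABLED) != 0;
    out->send_user_only = !(data & ASYNC_PF_SEND_ALWAYS);
    return true;
}
```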
2011-02-01 22:16:40 +03:00
static void kvmclock_reset ( struct kvm_vcpu * vcpu )
{
if ( vcpu - > arch . time_page ) {
kvm_release_page_dirty ( vcpu - > arch . time_page ) ;
vcpu - > arch . time_page = NULL ;
}
}
2011-07-11 23:28:14 +04:00
static void accumulate_steal_time ( struct kvm_vcpu * vcpu )
{
u64 delta ;
if ( ! ( vcpu - > arch . st . msr_val & KVM_MSR_ENABLED ) )
return ;
delta = current - > sched_info . run_delay - vcpu - > arch . st . last_steal ;
vcpu - > arch . st . last_steal = current - > sched_info . run_delay ;
vcpu - > arch . st . accum_steal = delta ;
}
static void record_steal_time ( struct kvm_vcpu * vcpu )
{
if ( ! ( vcpu - > arch . st . msr_val & KVM_MSR_ENABLED ) )
return ;
if ( unlikely ( kvm_read_guest_cached ( vcpu - > kvm , & vcpu - > arch . st . stime ,
& vcpu - > arch . st . steal , sizeof ( struct kvm_steal_time ) ) ) )
return ;
vcpu - > arch . st . steal . steal + = vcpu - > arch . st . accum_steal ;
vcpu - > arch . st . steal . version + = 2 ;
vcpu - > arch . st . accum_steal = 0 ;
kvm_write_guest_cached ( vcpu - > kvm , & vcpu - > arch . st . stime ,
& vcpu - > arch . st . steal , sizeof ( struct kvm_steal_time ) ) ;
}
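The two steal-time helpers form a small accumulate-then-publish pattern: one samples the scheduler's run delay and stores the difference, the other folds that difference into the guest-visible record and bumps the version by two. A userspace sketch of the bookkeeping, with a hypothetical struct and without the guest memory reads and writes:

```c
#include <stdint.h>

/* Hypothetical flattened view of the per-vCPU steal-time state. */
struct steal_state {
    uint64_t last_steal;   /* run_delay at the previous sample */
    uint64_t accum_steal;  /* delay not yet published to the guest */
    uint64_t steal;        /* guest-visible running total */
    uint32_t version;      /* stays even after each completed update */
};

/* Sample the current run delay and remember the delta. */
static void accumulate(struct steal_state *st, uint64_t run_delay)
{
    st->accum_steal = run_delay - st->last_steal;
    st->last_steal = run_delay;
}

/* Publish the pending delta and advance the version by two. */
static void record(struct steal_state *st)
{
    st->steal += st->accum_steal;
    st->version += 2;
    st->accum_steal = 0;
}
```

Because accumulate() assigns rather than adds, calling it twice without an intervening record() would drop the first delta; the sketch keeps that behavior to stay faithful to the code above.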
2007-10-30 20:44:17 +03:00
int kvm_set_msr_common ( struct kvm_vcpu * vcpu , u32 msr , u64 data )
{
2012-01-15 16:17:22 +04:00
bool pr = false ;
2007-10-30 20:44:17 +03:00
switch ( msr ) {
case MSR_EFER :
2010-05-06 13:38:43 +04:00
return set_efer ( vcpu , data ) ;
2009-06-24 14:44:33 +04:00
case MSR_K7_HWCR :
data & = ~ ( u64 ) 0x40 ; /* ignore flush filter disable */
2010-02-24 20:59:16 +03:00
data & = ~ ( u64 ) 0x100 ; /* ignore ignne emulation enable */
2012-02-22 01:44:21 +04:00
data & = ~ ( u64 ) 0x8 ; /* ignore TLB cache disable */
2009-06-24 14:44:33 +04:00
if ( data ! = 0 ) {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unimplemented HWCR wrmsr: 0x%llx \n " ,
data ) ;
2009-06-24 14:44:33 +04:00
return 1 ;
}
2007-10-30 20:44:17 +03:00
break ;
2009-07-02 17:04:14 +04:00
case MSR_FAM10H_MMIO_CONF_BASE :
if ( data ! = 0 ) {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unimplemented MMIO_CONF_BASE wrmsr: "
" 0x%llx \n " , data ) ;
2009-07-02 17:04:14 +04:00
return 1 ;
}
2007-10-30 20:44:17 +03:00
break ;
2009-06-24 17:37:05 +04:00
case MSR_AMD64_NB_CFG :
2008-02-11 22:28:27 +03:00
break ;
2008-07-22 10:00:45 +04:00
case MSR_IA32_DEBUGCTLMSR :
if ( ! data ) {
/* We support the non-activated case already */
break ;
} else if ( data & ~ ( DEBUGCTLMSR_LBR | DEBUGCTLMSR_BTF ) ) {
/* Values other than LBR and BTF are vendor-specific,
thus reserved and should throw a # GP */
return 1 ;
}
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " %s: MSR_IA32_DEBUGCTLMSR 0x%llx, nop \n " ,
__func__ , data ) ;
2008-07-22 10:00:45 +04:00
break ;
2007-10-30 20:44:17 +03:00
case MSR_IA32_UCODE_REV :
case MSR_IA32_UCODE_WRITE :
2008-12-29 18:32:28 +03:00
case MSR_VM_HSAVE_PA :
2009-07-03 18:00:14 +04:00
case MSR_AMD64_PATCH_LOADER :
2007-10-30 20:44:17 +03:00
break ;
2008-05-26 21:06:35 +04:00
case 0x200 . . . 0x2ff :
return set_msr_mtrr ( vcpu , msr , data ) ;
2007-10-30 20:44:17 +03:00
case MSR_IA32_APICBASE :
kvm_set_apic_base ( vcpu , data ) ;
break ;
2009-07-05 18:39:36 +04:00
case APIC_BASE_MSR . . . APIC_BASE_MSR + 0x3ff :
return kvm_x2apic_msr_write ( vcpu , msr , data ) ;
2011-09-22 12:55:52 +04:00
case MSR_IA32_TSCDEADLINE :
kvm_set_lapic_tscdeadline_msr ( vcpu , data ) ;
break ;
2007-10-30 20:44:17 +03:00
case MSR_IA32_MISC_ENABLE :
2007-12-13 18:50:52 +03:00
vcpu - > arch . ia32_misc_enable_msr = data ;
2007-10-30 20:44:17 +03:00
break ;
2010-05-11 20:17:41 +04:00
case MSR_KVM_WALL_CLOCK_NEW :
2008-02-15 22:52:47 +03:00
case MSR_KVM_WALL_CLOCK :
vcpu - > kvm - > arch . wall_clock = data ;
kvm_write_wall_clock ( vcpu - > kvm , data ) ;
break ;
2010-05-11 20:17:41 +04:00
case MSR_KVM_SYSTEM_TIME_NEW :
2008-02-15 22:52:47 +03:00
case MSR_KVM_SYSTEM_TIME : {
2011-02-01 22:16:40 +03:00
kvmclock_reset ( vcpu ) ;
2008-02-15 22:52:47 +03:00
vcpu - > arch . time = data ;
2010-09-19 04:38:15 +04:00
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ;
2008-02-15 22:52:47 +03:00
/* we verify if the enable bit is set... */
if ( ! ( data & 1 ) )
break ;
/* ...but clean it before doing the actual write */
vcpu - > arch . time_offset = data & ~ ( PAGE_MASK | 1 ) ;
vcpu - > arch . time_page =
gfn_to_page ( vcpu - > kvm , data > > PAGE_SHIFT ) ;
if ( is_error_page ( vcpu - > arch . time_page ) ) {
kvm_release_page_clean ( vcpu - > arch . time_page ) ;
vcpu - > arch . time_page = NULL ;
}
break ;
}
2010-10-14 13:22:50 +04:00
case MSR_KVM_ASYNC_PF_EN :
if ( kvm_pv_enable_async_pf ( vcpu , data ) )
return 1 ;
break ;
2011-07-11 23:28:14 +04:00
case MSR_KVM_STEAL_TIME :
if ( unlikely ( ! sched_info_on ( ) ) )
return 1 ;
if ( data & KVM_STEAL_RESERVED_MASK )
return 1 ;
if ( kvm_gfn_to_hva_cache_init ( vcpu - > kvm , & vcpu - > arch . st . stime ,
data & KVM_STEAL_VALID_BITS ) )
return 1 ;
vcpu - > arch . st . msr_val = data ;
if ( ! ( data & KVM_MSR_ENABLED ) )
break ;
vcpu - > arch . st . last_steal = current - > sched_info . run_delay ;
preempt_disable ( ) ;
accumulate_steal_time ( vcpu ) ;
preempt_enable ( ) ;
kvm_make_request ( KVM_REQ_STEAL_UPDATE , vcpu ) ;
break ;
2012-06-24 20:25:07 +04:00
case MSR_KVM_PV_EOI_EN :
if ( kvm_lapic_enable_pv_eoi ( vcpu , data ) )
return 1 ;
break ;
2011-07-11 23:28:14 +04:00
2009-05-11 12:48:15 +04:00
case MSR_IA32_MCG_CTL :
case MSR_IA32_MCG_STATUS :
case MSR_IA32_MC0_CTL . . . MSR_IA32_MC0_CTL + 4 * KVM_MAX_MCE_BANKS - 1 :
return set_msr_mce ( vcpu , msr , data ) ;
2009-06-13 00:01:29 +04:00
/* Performance counters are not protected by a CPUID bit,
 * so we should check all of them in the generic path for the sake of
 * cross vendor migration.
 * Writing a zero into the event select MSRs disables them,
 * which we perfectly emulate ;-). Any other value should be at least
 * reported, some guests depend on them.
 */
case MSR_K7_EVNTSEL0 :
case MSR_K7_EVNTSEL1 :
case MSR_K7_EVNTSEL2 :
case MSR_K7_EVNTSEL3 :
if ( data ! = 0 )
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unimplemented perfctr wrmsr: "
" 0x%x data 0x%llx \n " , msr , data ) ;
2009-06-13 00:01:29 +04:00
break ;
/* at least RHEL 4 unconditionally writes to the perfctr registers,
 * so we ignore writes to make it happy.
 */
case MSR_K7_PERFCTR0 :
case MSR_K7_PERFCTR1 :
case MSR_K7_PERFCTR2 :
case MSR_K7_PERFCTR3 :
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unimplemented perfctr wrmsr: "
" 0x%x data 0x%llx \n " , msr , data ) ;
2009-06-13 00:01:29 +04:00
break ;
2012-01-15 16:17:22 +04:00
case MSR_P6_PERFCTR0 :
case MSR_P6_PERFCTR1 :
pr = true ;
case MSR_P6_EVNTSEL0 :
case MSR_P6_EVNTSEL1 :
if ( kvm_pmu_msr ( vcpu , msr ) )
return kvm_pmu_set_msr ( vcpu , msr , data ) ;
if ( pr | | data ! = 0 )
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " disabled perfctr wrmsr: "
" 0x%x data 0x%llx \n " , msr , data ) ;
2012-01-15 16:17:22 +04:00
break ;
2010-09-01 13:42:04 +04:00
case MSR_K7_CLK_CTL :
/*
 * Ignore all writes to this no longer documented MSR.
 * Writes are only relevant for old K7 processors,
 * all pre-dating SVM, but a recommended workaround from
 * AMD for these chips. It is possible to specify the
 * affected processor models on the command line, hence
 * the need to ignore the workaround.
 */
break ;
2010-01-17 16:51:22 +03:00
case HV_X64_MSR_GUEST_OS_ID . . . HV_X64_MSR_SINT15 :
if ( kvm_hv_msr_partition_wide ( msr ) ) {
int r ;
mutex_lock ( & vcpu - > kvm - > lock ) ;
r = set_msr_hyperv_pw ( vcpu , msr , data ) ;
mutex_unlock ( & vcpu - > kvm - > lock ) ;
return r ;
} else
return set_msr_hyperv ( vcpu , msr , data ) ;
break ;
2011-01-21 08:21:00 +03:00
case MSR_IA32_BBL_CR_CTL3 :
/* Drop writes to this legacy MSR -- see rdmsr
 * counterpart for further detail.
 */
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " ignored wrmsr: 0x%x data %llx \n " , msr , data ) ;
2011-01-21 08:21:00 +03:00
break ;
2012-01-09 23:00:35 +04:00
case MSR_AMD64_OSVW_ID_LENGTH :
if ( ! guest_cpuid_has_osvw ( vcpu ) )
return 1 ;
vcpu - > arch . osvw . length = data ;
break ;
case MSR_AMD64_OSVW_STATUS :
if ( ! guest_cpuid_has_osvw ( vcpu ) )
return 1 ;
vcpu - > arch . osvw . status = data ;
break ;
2007-10-30 20:44:17 +03:00
default :
2009-10-16 02:21:43 +04:00
if ( msr & & ( msr = = vcpu - > kvm - > arch . xen_hvm_config . msr ) )
return xen_hvm_config ( vcpu , data ) ;
2011-11-10 16:57:22 +04:00
if ( kvm_pmu_msr ( vcpu , msr ) )
return kvm_pmu_set_msr ( vcpu , msr , data ) ;
2009-06-25 14:36:49 +04:00
if ( ! ignore_msrs ) {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unhandled wrmsr: 0x%x data %llx \n " ,
msr , data ) ;
2009-06-25 14:36:49 +04:00
return 1 ;
} else {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " ignored wrmsr: 0x%x data %llx \n " ,
msr , data ) ;
2009-06-25 14:36:49 +04:00
break ;
}
2007-10-30 20:44:17 +03:00
}
return 0 ;
}
EXPORT_SYMBOL_GPL ( kvm_set_msr_common ) ;
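The MSR_KVM_SYSTEM_TIME case above packs three things into one 64-bit write: an enable flag in bit 0, a guest frame number in the upper bits, and an offset within that page. The following is a minimal user-space sketch of that decoding, assuming 4 KiB pages; the `ex_` names and the `EX_PAGE_*` constants are hypothetical stand-ins for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86. These mirror PAGE_SHIFT/PAGE_MASK. */
#define EX_PAGE_SHIFT 12
#define EX_PAGE_SIZE  (1ULL << EX_PAGE_SHIFT)
#define EX_PAGE_MASK  (~(EX_PAGE_SIZE - 1))

/* Hypothetical container for the decoded fields. */
struct ex_system_time {
	int enabled;      /* bit 0 of the written value */
	uint64_t gfn;     /* guest frame number: data >> PAGE_SHIFT */
	uint64_t offset;  /* offset in the page, with the enable bit cleared */
};

struct ex_system_time ex_decode_system_time(uint64_t data)
{
	struct ex_system_time st;

	st.enabled = (int)(data & 1);
	st.gfn = data >> EX_PAGE_SHIFT;
	/* Same masking as the source: drop the page bits and bit 0. */
	st.offset = data & ~(EX_PAGE_MASK | 1);
	return st;
}

/* Small self-check exercising the decoder on one sample value. */
void ex_check_system_time(void)
{
	struct ex_system_time st = ex_decode_system_time(0x12345041ULL);

	assert(st.enabled == 1);
	assert(st.gfn == 0x12345);
	assert(st.offset == 0x40);
}
```

Clearing bit 0 before using the offset matters because the flag shares the low bits with the address; without the mask the structure would appear to start one byte late.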
/*
 * Reads an msr value (of 'msr_index') into 'pdata'.
 * Returns 0 on success, non-0 otherwise.
 * Assumes vcpu_load() was already called.
 */
int kvm_get_msr(struct kvm_vcpu *vcpu, u32 msr_index, u64 *pdata)
{
	return kvm_x86_ops->get_msr(vcpu, msr_index, pdata);
}
2008-05-26 21:06:35 +04:00
static int get_msr_mtrr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
2008-10-09 12:01:54 +04:00
	u64 *p = (u64 *)&vcpu->arch.mtrr_state.fixed_ranges;
2008-05-26 21:06:35 +04:00
	if (!msr_mtrr_valid(msr))
		return 1;
2008-10-09 12:01:54 +04:00
	if (msr == MSR_MTRRdefType)
		*pdata = vcpu->arch.mtrr_state.def_type +
			 (vcpu->arch.mtrr_state.enabled << 10);
	else if (msr == MSR_MTRRfix64K_00000)
		*pdata = p[0];
	else if (msr == MSR_MTRRfix16K_80000 || msr == MSR_MTRRfix16K_A0000)
		*pdata = p[1 + msr - MSR_MTRRfix16K_80000];
	else if (msr >= MSR_MTRRfix4K_C0000 && msr <= MSR_MTRRfix4K_F8000)
		*pdata = p[3 + msr - MSR_MTRRfix4K_C0000];
	else if (msr == MSR_IA32_CR_PAT)
		*pdata = vcpu->arch.pat;
	else {	/* Variable MTRRs */
		int idx, is_mtrr_mask;
		u64 *pt;
		idx = (msr - 0x200) / 2;
		is_mtrr_mask = msr - 0x200 - 2 * idx;
		if (!is_mtrr_mask)
			pt =
			  (u64 *)&vcpu->arch.mtrr_state.var_ranges[idx].base_lo;
		else
			pt =
			  (u64 *)&vcpu->arch.mtrr_state.var_ranges[idx].mask_lo;
		*pdata = *pt;
	}
2008-05-26 21:06:35 +04:00
	return 0;
}
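The index arithmetic in get_msr_mtrr() above can be tried out in isolation. The sketch below assumes the architectural MSR numbers from the Intel SDM (0x200 for the first variable-range base, 0x250/0x258/0x259/0x268..0x26f for the fixed ranges); the `ex_` helpers are hypothetical names used only for illustration and operate on plain integers rather than on kernel state.

```c
#include <assert.h>
#include <stdint.h>

/* MSR numbers per the Intel SDM; the macro names are illustrative. */
#define EX_MTRR_VAR_BASE     0x200u /* first variable-range base MSR */
#define EX_MTRR_FIX64K_00000 0x250u
#define EX_MTRR_FIX16K_80000 0x258u
#define EX_MTRR_FIX16K_A0000 0x259u
#define EX_MTRR_FIX4K_C0000  0x268u
#define EX_MTRR_FIX4K_F8000  0x26fu

/*
 * Map a fixed-range MTRR MSR to its slot in an 11-entry array:
 * slot 0 is the 64K range, slots 1-2 the 16K ranges, slots 3-10
 * the 4K ranges. Returns -1 for anything else.
 */
int ex_fixed_mtrr_slot(uint32_t msr)
{
	if (msr == EX_MTRR_FIX64K_00000)
		return 0;
	if (msr == EX_MTRR_FIX16K_80000 || msr == EX_MTRR_FIX16K_A0000)
		return (int)(1 + msr - EX_MTRR_FIX16K_80000);
	if (msr >= EX_MTRR_FIX4K_C0000 && msr <= EX_MTRR_FIX4K_F8000)
		return (int)(3 + msr - EX_MTRR_FIX4K_C0000);
	return -1;
}

/*
 * Variable ranges come in base/mask pairs: even offsets from 0x200
 * are bases, odd offsets are masks.
 */
void ex_decode_var_mtrr(uint32_t msr, int *idx, int *is_mask)
{
	uint32_t off = msr - EX_MTRR_VAR_BASE;

	*idx = (int)(off / 2);
	*is_mask = (int)(off - 2 * (uint32_t)*idx);
}

/* Small self-check for the variable-range decoding. */
void ex_check_var_mtrr(void)
{
	int idx, is_mask;

	ex_decode_var_mtrr(0x203, &idx, &is_mask);
	assert(idx == 1 && is_mask == 1);
	ex_decode_var_mtrr(0x204, &idx, &is_mask);
	assert(idx == 2 && is_mask == 0);
}
```

The subtraction `off - 2 * idx` is equivalent to `off & 1`; the sketch keeps the form used in the source so the two can be compared line by line.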
2009-05-11 12:48:15 +04:00
static int get_msr_mce(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
2007-10-30 20:44:17 +03:00
{
	u64 data;
2009-05-11 12:48:15 +04:00
	u64 mcg_cap = vcpu->arch.mcg_cap;
	unsigned bank_num = mcg_cap & 0xff;
2007-10-30 20:44:17 +03:00
	switch (msr) {
	case MSR_IA32_P5_MC_ADDR:
	case MSR_IA32_P5_MC_TYPE:
2009-05-11 12:48:15 +04:00
		data = 0;
		break;
2007-10-30 20:44:17 +03:00
	case MSR_IA32_MCG_CAP:
2009-05-11 12:48:15 +04:00
		data = vcpu->arch.mcg_cap;
		break;
2008-02-11 22:28:27 +03:00
	case MSR_IA32_MCG_CTL:
2009-05-11 12:48:15 +04:00
		if (!(mcg_cap & MCG_CTL_P))
			return 1;
		data = vcpu->arch.mcg_ctl;
		break;
	case MSR_IA32_MCG_STATUS:
		data = vcpu->arch.mcg_status;
		break;
	default:
		if (msr >= MSR_IA32_MC0_CTL &&
		    msr < MSR_IA32_MC0_CTL + 4 * bank_num) {
			u32 offset = msr - MSR_IA32_MC0_CTL;
			data = vcpu->arch.mce_banks[offset];
			break;
		}
		return 1;
	}
	*pdata = data;
	return 0;
}
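The default branch of get_msr_mce() above treats the per-bank registers as one flat array indexed by `msr - MSR_IA32_MC0_CTL`, bounded by the bank count in the low byte of MCG_CAP. A minimal sketch of that bounds check follows, assuming the architectural layout of four registers per bank (CTL, STATUS, ADDR, MISC) starting at MSR 0x400; the `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EX_MSR_MC0_CTL 0x400u /* architectural IA32_MC0_CTL */

/* Register order within one bank, per the architectural layout. */
enum ex_mc_reg { EX_MC_CTL, EX_MC_STATUS, EX_MC_ADDR, EX_MC_MISC };

/*
 * Return the flat array offset for a bank register, or -1 if the MSR
 * lies outside the banks advertised in the low byte of mcg_cap.
 */
int ex_mce_bank_offset(uint64_t mcg_cap, uint32_t msr)
{
	unsigned bank_num = (unsigned)(mcg_cap & 0xff);

	if (msr >= EX_MSR_MC0_CTL && msr < EX_MSR_MC0_CTL + 4 * bank_num)
		return (int)(msr - EX_MSR_MC0_CTL);
	return -1;
}

/* Self-check with 10 advertised banks. */
void ex_check_mce_offset(void)
{
	int off = ex_mce_bank_offset(10, 0x405);

	assert(off == 5);
	assert(off / 4 == 1);            /* bank 1 */
	assert(off % 4 == EX_MC_STATUS); /* its STATUS register */
	assert(ex_mce_bank_offset(10, 0x428) == -1); /* first MSR past bank 9 */
}
```

Dividing the offset by four gives the bank and the remainder gives the register within it, which is why the flat array needs no separate per-bank structure.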
2010-01-17 16:51:22 +03:00
static int get_msr_hyperv_pw(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
	u64 data = 0;
	struct kvm *kvm = vcpu->kvm;
	switch (msr) {
	case HV_X64_MSR_GUEST_OS_ID:
		data = kvm->arch.hv_guest_os_id;
		break;
	case HV_X64_MSR_HYPERCALL:
		data = kvm->arch.hv_hypercall;
		break;
	default:
2012-06-03 22:17:48 +04:00
		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
2010-01-17 16:51:22 +03:00
		return 1;
	}
	*pdata = data;
	return 0;
}
static int get_msr_hyperv(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
{
	u64 data = 0;
	switch (msr) {
	case HV_X64_MSR_VP_INDEX: {
		int r;
		struct kvm_vcpu *v;
		kvm_for_each_vcpu(r, v, vcpu->kvm)
			if (v == vcpu)
				data = r;
		break;
	}
2010-01-17 16:51:23 +03:00
	case HV_X64_MSR_EOI:
		return kvm_hv_vapic_msr_read(vcpu, APIC_EOI, pdata);
	case HV_X64_MSR_ICR:
		return kvm_hv_vapic_msr_read(vcpu, APIC_ICR, pdata);
	case HV_X64_MSR_TPR:
		return kvm_hv_vapic_msr_read(vcpu, APIC_TASKPRI, pdata);
2011-07-22 02:38:10 +04:00
	case HV_X64_MSR_APIC_ASSIST_PAGE:
2011-07-23 11:31:45 +04:00
		data = vcpu->arch.hv_vapic;
		break;
2010-01-17 16:51:22 +03:00
	default:
2012-06-03 22:17:48 +04:00
		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
2010-01-17 16:51:22 +03:00
		return 1;
	}
	*pdata = data;
	return 0;
}
2009-05-11 12:48:15 +04:00
int kvm_get_msr_common ( struct kvm_vcpu * vcpu , u32 msr , u64 * pdata )
{
u64 data ;
switch ( msr ) {
case MSR_IA32_PLATFORM_ID :
2007-10-30 20:44:17 +03:00
case MSR_IA32_EBL_CR_POWERON :
2008-07-22 10:00:45 +04:00
case MSR_IA32_DEBUGCTLMSR :
case MSR_IA32_LASTBRANCHFROMIP :
case MSR_IA32_LASTBRANCHTOIP :
case MSR_IA32_LASTINTFROMIP :
case MSR_IA32_LASTINTTOIP :
2009-05-14 09:30:10 +04:00
case MSR_K8_SYSCFG :
case MSR_K7_HWCR :
2008-12-29 18:32:28 +03:00
case MSR_VM_HSAVE_PA :
2009-06-15 11:55:34 +04:00
case MSR_K7_EVNTSEL0 :
2009-06-30 14:54:28 +04:00
case MSR_K7_PERFCTR0 :
2009-06-24 14:44:34 +04:00
case MSR_K8_INT_PENDING_MSG :
2009-06-24 17:37:05 +04:00
case MSR_AMD64_NB_CFG :
2009-07-02 17:04:14 +04:00
case MSR_FAM10H_MMIO_CONF_BASE :
2007-10-30 20:44:17 +03:00
data = 0 ;
break ;
2012-01-15 16:17:22 +04:00
case MSR_P6_PERFCTR0 :
case MSR_P6_PERFCTR1 :
case MSR_P6_EVNTSEL0 :
case MSR_P6_EVNTSEL1 :
if ( kvm_pmu_msr ( vcpu , msr ) )
return kvm_pmu_get_msr ( vcpu , msr , pdata ) ;
data = 0 ;
break ;
2011-07-30 02:44:21 +04:00
case MSR_IA32_UCODE_REV :
data = 0x100000000ULL ;
break ;
2008-05-26 21:06:35 +04:00
case MSR_MTRRcap :
data = 0x500 | KVM_NR_VAR_MTRR ;
break ;
case 0x200 . . . 0x2ff :
return get_msr_mtrr ( vcpu , msr , pdata ) ;
2007-10-30 20:44:17 +03:00
case 0xcd : /* fsb frequency */
data = 3 ;
break ;
2010-09-09 14:06:46 +04:00
/*
 * MSR_EBC_FREQUENCY_ID
 * Conservative value valid for even the basic CPU models.
 * Models 0,1: 000 in bits 23:21 indicating a bus speed of
 * 100MHz, model 2 000 in bits 18:16 indicating 100MHz,
 * and 266MHz for model 3, or 4. Set Core Clock
 * Frequency to System Bus Frequency Ratio to 1 (bits
 * 31:24) even though these are only valid for CPU
 * models > 2, however guests may end up dividing or
 * multiplying by zero otherwise.
 */
case MSR_EBC_FREQUENCY_ID :
data = 1 < < 24 ;
break ;
2007-10-30 20:44:17 +03:00
case MSR_IA32_APICBASE :
data = kvm_get_apic_base ( vcpu ) ;
break ;
2009-07-05 18:39:36 +04:00
case APIC_BASE_MSR . . . APIC_BASE_MSR + 0x3ff :
return kvm_x2apic_msr_read ( vcpu , msr , pdata ) ;
break ;
2011-09-22 12:55:52 +04:00
case MSR_IA32_TSCDEADLINE :
data = kvm_get_lapic_tscdeadline_msr ( vcpu ) ;
break ;
2007-10-30 20:44:17 +03:00
case MSR_IA32_MISC_ENABLE :
2007-12-13 18:50:52 +03:00
data = vcpu - > arch . ia32_misc_enable_msr ;
2007-10-30 20:44:17 +03:00
break ;
2008-02-21 14:11:01 +03:00
case MSR_IA32_PERF_STATUS :
/* TSC increment by tick */
data = 1000ULL ;
/* CPU multiplier */
data | = ( ( ( uint64_t ) 4ULL ) < < 40 ) ;
break ;
2007-10-30 20:44:17 +03:00
case MSR_EFER :
2010-01-21 16:31:50 +03:00
data = vcpu - > arch . efer ;
2007-10-30 20:44:17 +03:00
break ;
2008-02-15 22:52:47 +03:00
case MSR_KVM_WALL_CLOCK :
2010-05-11 20:17:41 +04:00
case MSR_KVM_WALL_CLOCK_NEW :
2008-02-15 22:52:47 +03:00
data = vcpu - > kvm - > arch . wall_clock ;
break ;
case MSR_KVM_SYSTEM_TIME :
2010-05-11 20:17:41 +04:00
case MSR_KVM_SYSTEM_TIME_NEW :
2008-02-15 22:52:47 +03:00
data = vcpu - > arch . time ;
break ;
2010-10-14 13:22:50 +04:00
case MSR_KVM_ASYNC_PF_EN :
data = vcpu - > arch . apf . msr_val ;
break ;
2011-07-11 23:28:14 +04:00
case MSR_KVM_STEAL_TIME :
data = vcpu - > arch . st . msr_val ;
break ;
2012-08-26 19:00:29 +04:00
case MSR_KVM_PV_EOI_EN :
data = vcpu - > arch . pv_eoi . msr_val ;
break ;
2009-05-11 12:48:15 +04:00
case MSR_IA32_P5_MC_ADDR :
case MSR_IA32_P5_MC_TYPE :
case MSR_IA32_MCG_CAP :
case MSR_IA32_MCG_CTL :
case MSR_IA32_MCG_STATUS :
case MSR_IA32_MC0_CTL . . . MSR_IA32_MC0_CTL + 4 * KVM_MAX_MCE_BANKS - 1 :
return get_msr_mce ( vcpu , msr , pdata ) ;
2010-09-01 13:42:04 +04:00
case MSR_K7_CLK_CTL :
/*
 * Provide expected ramp-up count for K7. All other
 * are set to zero, indicating minimum divisors for
 * every field.
 *
 * This prevents guest kernels on AMD host with CPU
 * type 6, model 8 and higher from exploding due to
 * the rdmsr failing.
 */
data = 0x20000000 ;
break ;
2010-01-17 16:51:22 +03:00
case HV_X64_MSR_GUEST_OS_ID . . . HV_X64_MSR_SINT15 :
if ( kvm_hv_msr_partition_wide ( msr ) ) {
int r ;
mutex_lock ( & vcpu - > kvm - > lock ) ;
r = get_msr_hyperv_pw ( vcpu , msr , pdata ) ;
mutex_unlock ( & vcpu - > kvm - > lock ) ;
return r ;
} else
return get_msr_hyperv ( vcpu , msr , pdata ) ;
break ;
2011-01-21 08:21:00 +03:00
case MSR_IA32_BBL_CR_CTL3 :
/* This legacy MSR exists but isn't fully documented in current
 * silicon. It is however accessed by winxp in very narrow
 * scenarios where it sets bit #19, itself documented as
 * a "reserved" bit. Best effort attempt to source coherent
 * read data here should the balance of the register be
 * interpreted by the guest:
 *
 * L2 cache control register 3: 64GB range, 256KB size,
 * enabled, latency 0x1, configured
 */
data = 0xbe702111 ;
break ;
2012-01-09 23:00:35 +04:00
case MSR_AMD64_OSVW_ID_LENGTH :
if ( ! guest_cpuid_has_osvw ( vcpu ) )
return 1 ;
data = vcpu - > arch . osvw . length ;
break ;
case MSR_AMD64_OSVW_STATUS :
if ( ! guest_cpuid_has_osvw ( vcpu ) )
return 1 ;
data = vcpu - > arch . osvw . status ;
break ;
2007-10-30 20:44:17 +03:00
default :
2011-11-10 16:57:22 +04:00
if ( kvm_pmu_msr ( vcpu , msr ) )
return kvm_pmu_get_msr ( vcpu , msr , pdata ) ;
2009-06-25 14:36:49 +04:00
if ( ! ignore_msrs ) {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " unhandled rdmsr: 0x%x \n " , msr ) ;
2009-06-25 14:36:49 +04:00
return 1 ;
} else {
2012-06-03 22:17:48 +04:00
vcpu_unimpl ( vcpu , " ignored rdmsr: 0x%x \n " , msr ) ;
2009-06-25 14:36:49 +04:00
data = 0 ;
}
break ;
2007-10-30 20:44:17 +03:00
}
* pdata = data ;
return 0 ;
}
EXPORT_SYMBOL_GPL ( kvm_get_msr_common ) ;
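The MSR_IA32_PERF_STATUS case in kvm_get_msr_common() above builds its return value from two constants: 1000 in the low bits and a multiplier of 4 shifted to bit 40. The sketch below only reproduces that composition and reads the two fields back; the field widths used for extraction (16 low bits, 5 bits at position 40) are assumptions made for the example, and the `ex_` names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Compose the value exactly as the source does. */
uint64_t ex_perf_status(void)
{
	uint64_t data;

	data = 1000ULL;      /* "TSC increment by tick" in the source comment */
	data |= 4ULL << 40;  /* "CPU multiplier" in the source comment */
	return data;
}

/* Assumed field widths, for illustration only. */
unsigned ex_perf_status_low(uint64_t data)
{
	return (unsigned)(data & 0xffff);
}

unsigned ex_perf_status_multiplier(uint64_t data)
{
	return (unsigned)((data >> 40) & 0x1f);
}

/* Self-check: both fields survive the round trip. */
void ex_check_perf_status(void)
{
	uint64_t data = ex_perf_status();

	assert(data == 0x400000003E8ULL);
	assert(ex_perf_status_low(data) == 1000);
	assert(ex_perf_status_multiplier(data) == 4);
}
```

The cast to a 64-bit type before shifting is what keeps the multiplier from being truncated; shifting a plain `int` by 40 would be undefined behaviour in C.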
KVM: Portability: split kvm_vcpu_ioctl
This patch splits kvm_vcpu_ioctl into architecture independent parts, and
x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
Common ioctls for all architectures are:
KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT,
KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU
Note that some PPC chips don't have an FPU, so we might need an #ifdef
around KVM_GET/SET_FPU one day.
x86 specific ioctls are:
KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS
An interesting aspect is vcpu_load/vcpu_put. We now have a common
vcpu_load/put which does the preemption stuff, and an architecture
specific kvm_arch_vcpu_load/put. In the x86 case, this one calls the
vmx/svm function defined in kvm_x86_ops.
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-11 21:16:52 +04:00
/*
 * Read or write a bunch of msrs. All parameters are kernel addresses.
 *
 * @return number of msrs set successfully.
 */
static int __msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs *msrs,
		    struct kvm_msr_entry *entries,
		    int (*do_msr)(struct kvm_vcpu *vcpu,
				  unsigned index, u64 *data))
{
2009-12-23 19:35:25 +03:00
	int i, idx;
2007-10-11 21:16:52 +04:00
2009-12-23 19:35:25 +03:00
	idx = srcu_read_lock(&vcpu->kvm->srcu);
2007-10-11 21:16:52 +04:00
	for (i = 0; i < msrs->nmsrs; ++i)
		if (do_msr(vcpu, entries[i].index, &entries[i].data))
			break;
2009-12-23 19:35:25 +03:00
	srcu_read_unlock(&vcpu->kvm->srcu, idx);
2007-10-11 21:16:52 +04:00
	return i;
}
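__msr_io() above applies a callback to each entry in order and stops at the first failure, so its return value is the count of entries processed successfully rather than an error code. The following user-space sketch shows that contract under simplified assumptions: there is no vcpu argument or SRCU locking, and `ex_msr_entry`, `ex_msr_io`, and `ex_read_msr` are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for struct kvm_msr_entry. */
struct ex_msr_entry {
	uint32_t index;
	uint64_t data;
};

/* Callback type: returns 0 on success, non-zero on failure. */
typedef int (*ex_do_msr_fn)(uint32_t index, uint64_t *data);

/* Process entries in order; stop at the first failure, return the count. */
int ex_msr_io(struct ex_msr_entry *entries, int n, ex_do_msr_fn do_msr)
{
	int i;

	for (i = 0; i < n; ++i)
		if (do_msr(entries[i].index, &entries[i].data))
			break;
	return i;
}

/* Toy reader: indices below 0x100 succeed, everything else fails. */
int ex_read_msr(uint32_t index, uint64_t *data)
{
	if (index >= 0x100)
		return 1;
	*data = (uint64_t)index * 2;
	return 0;
}

/* Self-check: the third entry fails, so only two are processed. */
void ex_check_msr_io(void)
{
	struct ex_msr_entry e[4] = {
		{ 1, 0 }, { 2, 0 }, { 0x200, 0 }, { 3, 0 },
	};

	assert(ex_msr_io(e, 4, ex_read_msr) == 2);
	assert(e[0].data == 2 && e[1].data == 4);
	assert(e[3].data == 0); /* never reached */
}
```

A caller comparing the returned count with the number of entries requested can tell which entry failed first, since entries after it are left untouched.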
/*
 * Read or write a bunch of msrs. Parameters are user addresses.
 *
 * @return number of msrs set successfully.
 */
static int msr_io(struct kvm_vcpu *vcpu, struct kvm_msrs __user *user_msrs,
		  int (*do_msr)(struct kvm_vcpu *vcpu,
				unsigned index, u64 *data),
		  int writeback)
{
	struct kvm_msrs msrs;
	struct kvm_msr_entry *entries;
	int r, n;
	unsigned size;
	r = -EFAULT;
	if (copy_from_user(&msrs, user_msrs, sizeof msrs))
		goto out;
	r = -E2BIG;
	if (msrs.nmsrs >= MAX_IO_MSRS)
		goto out;
	size = sizeof(struct kvm_msr_entry) * msrs.nmsrs;
2011-12-04 21:36:29 +04:00
	entries = memdup_user(user_msrs->entries, size);
	if (IS_ERR(entries)) {
		r = PTR_ERR(entries);
2007-10-11 21:16:52 +04:00
		goto out;
2011-12-04 21:36:29 +04:00
	}
2007-10-11 21:16:52 +04:00
	r = n = __msr_io(vcpu, &msrs, entries, do_msr);
	if (r < 0)
		goto out_free;
	r = -EFAULT;
	if (writeback && copy_to_user(user_msrs->entries, entries, size))
		goto out_free;
	r = n;
out_free:
2010-07-23 00:24:52 +04:00
	kfree(entries);
2007-10-11 21:16:52 +04:00
out:
	return r;
}
2007-11-15 18:07:47 +03:00
int kvm_dev_ioctl_check_extension ( long ext )
{
int r ;
switch ( ext ) {
case KVM_CAP_IRQCHIP :
case KVM_CAP_HLT :
case KVM_CAP_MMU_SHADOW_CACHE_CONTROL :
case KVM_CAP_SET_TSS_ADDR :
2007-11-21 18:10:04 +03:00
case KVM_CAP_EXT_CPUID :
2009-02-04 19:52:04 +03:00
case KVM_CAP_CLOCKSOURCE :
2008-01-28 00:10:22 +03:00
case KVM_CAP_PIT :
2008-02-22 20:21:36 +03:00
case KVM_CAP_NOP_IO_DELAY :
2008-04-11 20:24:45 +04:00
case KVM_CAP_MP_STATE :
2008-07-29 12:30:57 +04:00
case KVM_CAP_SYNC_MMU :
2010-12-14 12:57:47 +03:00
case KVM_CAP_USER_NMI :
2008-12-30 20:55:06 +03:00
case KVM_CAP_REINJECT_CONTROL :
2009-02-04 18:28:14 +03:00
case KVM_CAP_IRQ_INJECT_STATUS :
2009-03-12 16:45:39 +03:00
case KVM_CAP_ASSIGN_DEV_IRQ :
2009-05-20 18:30:49 +04:00
case KVM_CAP_IRQFD :
KVM: add ioeventfd support
ioeventfd is a mechanism to register PIO/MMIO regions to trigger an eventfd
signal when written to by a guest. Host userspace can register any
arbitrary IO address with a corresponding eventfd and then pass the eventfd
to a specific end-point of interest for handling.
Normal IO requires a blocking round-trip since the operation may cause
side-effects in the emulated model or may return data to the caller.
Therefore, an IO in KVM traps from the guest to the host, causes a VMX/SVM
"heavy-weight" exit back to userspace, and is ultimately serviced by qemu's
device model synchronously before returning control back to the vcpu.
However, there is a subclass of IO which acts purely as a trigger for
other IO (such as to kick off an out-of-band DMA request, etc). For these
patterns, the synchronous call is particularly expensive since we really
only want to simply get our notification transmitted asynchronously and
return as quickly as possible. All the synchronous infrastructure to ensure
proper data-dependencies are met in the normal IO case is just unnecessary
overhead for signalling. This adds additional computational load on the
system, as well as latency to the signalling path.
Therefore, we provide a mechanism for registration of an in-kernel trigger
point that allows the VCPU to only require a very brief, lightweight
exit just long enough to signal an eventfd. This also means that any
clients compatible with the eventfd interface (which includes userspace
and kernelspace equally well) can now register to be notified. The end
result should be a more flexible and higher performance notification API
for the backend KVM hypervisor and peripheral components.
To test this theory, we built a test-harness called "doorbell". This
module has a function called "doorbell_ring()" which simply increments a
counter for each time the doorbell is signaled. It supports signalling
from either an eventfd, or an ioctl().
We then wired up two paths to the doorbell: One via QEMU via a registered
io region and through the doorbell ioctl(). The other is direct via
ioeventfd.
You can download this test harness here:
ftp://ftp.novell.com/dev/ghaskins/doorbell.tar.bz2
The measured results are as follows:
qemu-mmio: 110000 iops, 9.09us rtt
ioeventfd-mmio: 200100 iops, 5.00us rtt
ioeventfd-pio: 367300 iops, 2.72us rtt
I didn't measure qemu-pio, because I have to figure out how to register a
PIO region with qemu's device model, and I got lazy. However, for now we
can extrapolate based on the data from the NULLIO runs of +2.56us for MMIO,
and -350ns for HC, we get:
qemu-pio: 153139 iops, 6.53us rtt
ioeventfd-hc: 412585 iops, 2.37us rtt
these are just for fun, for now, until I can gather more data.
Here is a graph for your convenience:
http://developer.novell.com/wiki/images/7/76/Iofd-chart.png
The conclusion to draw is that we save about 4us by skipping the userspace
hop.
--------------------
Signed-off-by: Gregory Haskins <ghaskins@novell.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2009-07-08 01:08:49 +04:00
case KVM_CAP_IOEVENTFD :
2009-05-15 00:42:53 +04:00
case KVM_CAP_PIT2 :
2009-07-07 19:50:38 +04:00
case KVM_CAP_PIT_STATE2 :
2009-07-21 06:42:48 +04:00
case KVM_CAP_SET_IDENTITY_MAP_ADDR :
2009-10-16 02:21:43 +04:00
case KVM_CAP_XEN_HVM :
2009-10-16 23:28:36 +04:00
case KVM_CAP_ADJUST_CLOCK :
2009-11-12 03:04:25 +03:00
case KVM_CAP_VCPU_EVENTS :
2010-01-17 16:51:22 +03:00
case KVM_CAP_HYPERV :
2010-01-17 16:51:23 +03:00
case KVM_CAP_HYPERV_VAPIC :
2010-01-17 16:51:24 +03:00
case KVM_CAP_HYPERV_SPIN :
2010-01-29 09:38:44 +03:00
case KVM_CAP_PCI_SEGMENT :
2010-02-15 12:45:43 +03:00
case KVM_CAP_DEBUGREGS :
2010-02-23 19:47:57 +03:00
case KVM_CAP_X86_ROBUST_SINGLESTEP :
2010-06-13 13:29:39 +04:00
case KVM_CAP_XSAVE :
2010-10-14 13:22:50 +04:00
case KVM_CAP_ASYNC_PF :
2011-03-25 11:44:51 +03:00
case KVM_CAP_GET_TSC_KHZ :
2012-02-28 17:19:54 +04:00
case KVM_CAP_PCI_2_3 :
2012-03-10 23:37:27 +04:00
case KVM_CAP_KVMCLOCK_CTRL :
2007-11-15 18:07:47 +03:00
r = 1 ;
break ;
2008-05-30 18:05:55 +04:00
case KVM_CAP_COALESCED_MMIO :
r = KVM_COALESCED_MMIO_PAGE_OFFSET ;
break ;
2007-12-26 14:57:04 +03:00
case KVM_CAP_VAPIC :
r = ! kvm_x86_ops - > cpu_has_accelerated_tpr ( ) ;
break ;
2008-02-20 12:53:16 +03:00
case KVM_CAP_NR_VCPUS :
2011-07-18 18:17:15 +04:00
r = KVM_SOFT_MAX_VCPUS ;
break ;
case KVM_CAP_MAX_VCPUS :
2008-02-20 12:53:16 +03:00
r = KVM_MAX_VCPUS ;
break ;
2008-02-20 12:59:20 +03:00
case KVM_CAP_NR_MEMSLOTS :
r = KVM_MEMORY_SLOTS ;
break ;
2009-10-02 02:28:39 +04:00
case KVM_CAP_PV_MMU : /* obsolete */
r = 0 ;
2008-02-22 20:21:37 +03:00
break ;
2008-09-14 04:48:28 +04:00
case KVM_CAP_IOMMU :
2011-09-06 20:46:34 +04:00
r = iommu_present ( & pci_bus_type ) ;
2008-09-14 04:48:28 +04:00
break ;
2009-05-11 12:48:15 +04:00
case KVM_CAP_MCE :
r = KVM_MAX_MCE_BANKS ;
break ;
2010-06-13 13:29:39 +04:00
case KVM_CAP_XCRS :
r = cpu_has_xsave ;
break ;
2011-03-25 11:44:51 +03:00
case KVM_CAP_TSC_CONTROL :
r = kvm_has_tsc_control ;
break ;
2011-12-21 15:28:29 +04:00
case KVM_CAP_TSC_DEADLINE_TIMER :
r = boot_cpu_has ( X86_FEATURE_TSC_DEADLINE_TIMER ) ;
break ;
2007-11-15 18:07:47 +03:00
default :
r = 0 ;
break ;
}
return r ;
}
2007-10-10 19:16:19 +04:00
long kvm_arch_dev_ioctl ( struct file * filp ,
unsigned int ioctl , unsigned long arg )
{
void __user * argp = ( void __user * ) arg ;
long r ;
switch ( ioctl ) {
case KVM_GET_MSR_INDEX_LIST : {
struct kvm_msr_list __user * user_msr_list = argp ;
struct kvm_msr_list msr_list ;
unsigned n ;
r = - EFAULT ;
if ( copy_from_user ( & msr_list , user_msr_list , sizeof msr_list ) )
goto out ;
n = msr_list . nmsrs ;
msr_list . nmsrs = num_msrs_to_save + ARRAY_SIZE ( emulated_msrs ) ;
if ( copy_to_user ( user_msr_list , & msr_list , sizeof msr_list ) )
goto out ;
r = - E2BIG ;
2009-07-02 23:45:47 +04:00
if ( n < msr_list . nmsrs )
2007-10-10 19:16:19 +04:00
goto out ;
r = - EFAULT ;
if ( copy_to_user ( user_msr_list - > indices , & msrs_to_save ,
num_msrs_to_save * sizeof ( u32 ) ) )
goto out ;
2009-07-02 23:45:47 +04:00
if ( copy_to_user ( user_msr_list - > indices + num_msrs_to_save ,
2007-10-10 19:16:19 +04:00
& emulated_msrs ,
ARRAY_SIZE ( emulated_msrs ) * sizeof ( u32 ) ) )
goto out ;
r = 0 ;
break ;
}
2008-02-11 19:37:23 +03:00
case KVM_GET_SUPPORTED_CPUID : {
struct kvm_cpuid2 __user * cpuid_arg = argp ;
struct kvm_cpuid2 cpuid ;
r = - EFAULT ;
if ( copy_from_user ( & cpuid , cpuid_arg , sizeof cpuid ) )
goto out ;
r = kvm_dev_ioctl_get_supported_cpuid ( & cpuid ,
2009-01-14 19:56:00 +03:00
cpuid_arg - > entries ) ;
2008-02-11 19:37:23 +03:00
if ( r )
goto out ;
r = - EFAULT ;
if ( copy_to_user ( cpuid_arg , & cpuid , sizeof cpuid ) )
goto out ;
r = 0 ;
break ;
}
2009-05-11 12:48:15 +04:00
case KVM_X86_GET_MCE_CAP_SUPPORTED : {
u64 mce_cap ;
mce_cap = KVM_MCE_CAP_SUPPORTED ;
r = - EFAULT ;
if ( copy_to_user ( argp , & mce_cap , sizeof mce_cap ) )
goto out ;
r = 0 ;
break ;
}
2007-10-10 19:16:19 +04:00
default :
r = - EINVAL ;
}
out :
return r ;
}
2010-06-30 08:25:15 +04:00
static void wbinvd_ipi ( void * garbage )
{
wbinvd ( ) ;
}
static bool need_emulate_wbinvd ( struct kvm_vcpu * vcpu )
{
return vcpu - > kvm - > arch . iommu_domain & &
! ( vcpu - > kvm - > arch . iommu_flags & KVM_IOMMU_CACHE_COHERENCY ) ;
}
KVM: Portability: split kvm_vcpu_ioctl
This patch splits kvm_vcpu_ioctl into architecture independent parts, and
x86 specific parts which go to kvm_arch_vcpu_ioctl in x86.c.
Common ioctls for all architectures are:
KVM_RUN, KVM_GET/SET_(S-)REGS, KVM_TRANSLATE, KVM_INTERRUPT,
KVM_DEBUG_GUEST, KVM_SET_SIGNAL_MASK, KVM_GET/SET_FPU
Note that some PPC chips don't have an FPU, so we might need an #ifdef
around KVM_GET/SET_FPU one day.
x86 specific ioctls are:
KVM_GET/SET_LAPIC, KVM_SET_CPUID, KVM_GET/SET_MSRS
An interesting aspect is vcpu_load/vcpu_put. We now have a common
vcpu_load/put which does the preemption stuff, and an architecture
specific kvm_arch_vcpu_load/put. In the x86 case, this one calls the
vmx/svm function defined in kvm_x86_ops.
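The layering described above can be modelled in plain user-space C. This is a rough sketch under stated assumptions, not the kernel code: every name ending in _model, as well as backend_load/backend_put, is hypothetical, and a simple counter stands in for the real preemption handling.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm_vcpu; only what the sketch needs. */
struct vcpu_model {
	int cpu;	/* physical CPU the vcpu was last loaded on, -1 if none */
	int loaded;	/* nonzero while architecture state is loaded */
};

/* Hypothetical stand-in for the kvm_x86_ops table filled in by vmx or svm. */
struct arch_ops_model {
	void (*vcpu_load)(struct vcpu_model *vcpu, int cpu);
	void (*vcpu_put)(struct vcpu_model *vcpu);
};

static void backend_load(struct vcpu_model *vcpu, int cpu)
{
	vcpu->cpu = cpu;
	vcpu->loaded = 1;
}

static void backend_put(struct vcpu_model *vcpu)
{
	vcpu->loaded = 0;
}

static const struct arch_ops_model backend_ops = {
	.vcpu_load = backend_load,
	.vcpu_put = backend_put,
};

static const struct arch_ops_model *ops_model = &backend_ops;
static int preempt_depth_model;	/* stands in for the preemption count */

/* Architecture-specific layer: forwards to the registered backend. */
static void arch_vcpu_load_model(struct vcpu_model *vcpu, int cpu)
{
	assert(ops_model != NULL);
	ops_model->vcpu_load(vcpu, cpu);
}

static void arch_vcpu_put_model(struct vcpu_model *vcpu)
{
	assert(ops_model != NULL);
	ops_model->vcpu_put(vcpu);
}

/* Common layer: brackets the arch hook with the preemption handling. */
static void vcpu_load_model(struct vcpu_model *vcpu, int cpu)
{
	preempt_depth_model++;		/* models disabling preemption */
	arch_vcpu_load_model(vcpu, cpu);
	preempt_depth_model--;		/* models re-enabling preemption */
}

static void vcpu_put_model(struct vcpu_model *vcpu)
{
	preempt_depth_model++;
	arch_vcpu_put_model(vcpu);
	preempt_depth_model--;
}
```

The point of the split is that the common functions never need to know which backend is active; they only call through the ops table.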
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-11 21:16:52 +04:00
void kvm_arch_vcpu_load ( struct kvm_vcpu * vcpu , int cpu )
{
2010-06-30 08:25:15 +04:00
/* Handle the case where the guest may execute WBINVD */
if ( need_emulate_wbinvd ( vcpu ) ) {
if ( kvm_x86_ops - > has_wbinvd_exit ( ) )
cpumask_set_cpu ( cpu , vcpu - > arch . wbinvd_dirty_mask ) ;
else if ( vcpu - > cpu ! = - 1 & & vcpu - > cpu ! = cpu )
smp_call_function_single ( vcpu - > cpu ,
wbinvd_ipi , NULL , 1 ) ;
}
2007-10-11 21:16:52 +04:00
kvm_x86_ops - > vcpu_load ( vcpu , cpu ) ;
2011-03-25 11:44:48 +03:00
2012-02-03 21:43:56 +04:00
/* Apply any externally detected TSC adjustments (due to suspend) */
if ( unlikely ( vcpu - > arch . tsc_offset_adjustment ) ) {
adjust_tsc_offset_host ( vcpu , vcpu - > arch . tsc_offset_adjustment ) ;
vcpu - > arch . tsc_offset_adjustment = 0 ;
set_bit ( KVM_REQ_CLOCK_UPDATE , & vcpu - > requests ) ;
}
2011-03-25 11:44:48 +03:00
2010-08-20 12:07:24 +04:00
if ( unlikely ( vcpu - > cpu ! = cpu ) | | check_tsc_unstable ( ) ) {
2012-02-03 21:43:54 +04:00
s64 tsc_delta = ! vcpu - > arch . last_host_tsc ? 0 :
native_read_tsc ( ) - vcpu - > arch . last_host_tsc ;
2010-08-20 12:07:23 +04:00
if ( tsc_delta < 0 )
mark_tsc_unstable ( "KVM discovered backwards TSC" ) ;
2010-09-19 04:38:15 +04:00
if ( check_tsc_unstable ( ) ) {
KVM: Fix last_guest_tsc / tsc_offset semantics
The variable last_guest_tsc was being used as an ad-hoc indicator
that guest TSC has been initialized and recorded correctly. However,
it may not have been: the guest TSC could have been set to some
large value, then back to a small value (by, say, a software reboot).
This defeats the logic and causes KVM to falsely assume that the
guest TSC has gone backwards, marking the host TSC unstable, which
is undesirable behavior.
In addition, rather than try to compute an offset adjustment for the
TSC on unstable platforms, just recompute the whole offset. This
allows us to get rid of one callsite for adjust_tsc_offset, which
is problematic because the units it takes are in guest units, but
here, the computation was originally being done in host units.
Doing this, and also recording last_guest_tsc when the TSC is written,
allows us to remove the tricky logic which depended on last_guest_tsc
being zero to indicate a reset or an uninitialized value.
Instead, we now have the guarantee that the guest TSC offset is
always at least something which will get us last_guest_tsc.
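The idea of recomputing the whole offset can be sketched in user-space C. This is only a model under the assumption that the guest observes host TSC plus offset with 64-bit wraparound; the function names below are hypothetical and are not the kvm_x86_ops callbacks.

```c
#include <assert.h>
#include <stdint.h>

/* Model: the guest observes host TSC plus offset, modulo 2^64. */
static uint64_t guest_tsc_model(uint64_t host_tsc, uint64_t offset)
{
	return host_tsc + offset;
}

/*
 * Recompute the whole offset so that the guest reads target_guest_tsc
 * at the moment the host TSC reads host_tsc. Unsigned subtraction wraps,
 * so this also works when the target is below the host value.
 */
static uint64_t compute_offset_model(uint64_t host_tsc, uint64_t target_guest_tsc)
{
	return target_guest_tsc - host_tsc;
}

/* Resynchronize to the last recorded guest TSC and check the invariant. */
static uint64_t resync_offset_model(uint64_t host_tsc, uint64_t last_guest_tsc)
{
	uint64_t offset = compute_offset_model(host_tsc, last_guest_tsc);

	assert(guest_tsc_model(host_tsc, offset) == last_guest_tsc);
	return offset;
}
```

Because the offset is derived directly from last_guest_tsc, no incremental adjustment is involved, and so no mixing of guest and host units can occur in this path.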
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:53 +04:00
u64 offset = kvm_x86_ops - > compute_tsc_offset ( vcpu ,
vcpu - > arch . last_guest_tsc ) ;
kvm_x86_ops - > write_tsc_offset ( vcpu , offset ) ;
2010-09-19 04:38:15 +04:00
vcpu - > arch . tsc_catchup = 1 ;
}
2011-03-10 01:36:51 +03:00
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ;
2010-09-19 04:38:15 +04:00
if ( vcpu - > cpu ! = cpu )
kvm_migrate_timers ( vcpu ) ;
2010-08-20 12:07:23 +04:00
vcpu - > cpu = cpu ;
2009-10-10 06:26:08 +04:00
}
2011-07-11 23:28:14 +04:00
accumulate_steal_time ( vcpu ) ;
kvm_make_request ( KVM_REQ_STEAL_UPDATE , vcpu ) ;
2007-10-11 21:16:52 +04:00
}
void kvm_arch_vcpu_put ( struct kvm_vcpu * vcpu )
{
2009-12-30 13:40:26 +03:00
kvm_x86_ops - > vcpu_put ( vcpu ) ;
2010-05-03 17:05:44 +04:00
kvm_put_guest_fpu ( vcpu ) ;
2012-02-03 21:43:54 +04:00
vcpu - > arch . last_host_tsc = native_read_tsc ( ) ;
2007-10-11 21:16:52 +04:00
}
static int kvm_vcpu_ioctl_get_lapic ( struct kvm_vcpu * vcpu ,
struct kvm_lapic_state * s )
{
2007-12-13 18:50:52 +03:00
memcpy ( s - > regs , vcpu - > arch . apic - > regs , sizeof * s ) ;
2007-10-11 21:16:52 +04:00
return 0 ;
}
static int kvm_vcpu_ioctl_set_lapic ( struct kvm_vcpu * vcpu ,
struct kvm_lapic_state * s )
{
2007-12-13 18:50:52 +03:00
memcpy ( vcpu - > arch . apic - > regs , s - > regs , sizeof * s ) ;
2007-10-11 21:16:52 +04:00
kvm_apic_post_state_restore ( vcpu ) ;
2009-08-09 16:17:40 +04:00
update_cr8_intercept ( vcpu ) ;
2007-10-11 21:16:52 +04:00
return 0 ;
}
2007-11-20 23:36:41 +03:00
static int kvm_vcpu_ioctl_interrupt ( struct kvm_vcpu * vcpu ,
struct kvm_interrupt * irq )
{
if ( irq - > irq < 0 | | irq - > irq > = 256 )
return - EINVAL ;
if ( irqchip_in_kernel ( vcpu - > kvm ) )
return - ENXIO ;
2009-05-11 14:35:50 +04:00
kvm_queue_interrupt ( vcpu , irq - > irq , false ) ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2007-11-20 23:36:41 +03:00
return 0 ;
}
2008-09-26 11:30:55 +04:00
static int kvm_vcpu_ioctl_nmi ( struct kvm_vcpu * vcpu )
{
kvm_inject_nmi ( vcpu ) ;
return 0 ;
}
2007-10-22 18:50:39 +04:00
static int vcpu_ioctl_tpr_access_reporting ( struct kvm_vcpu * vcpu ,
struct kvm_tpr_access_ctl * tac )
{
if ( tac - > flags )
return - EINVAL ;
vcpu - > arch . tpr_access_reporting = ! ! tac - > enabled ;
return 0 ;
}
2009-05-11 12:48:15 +04:00
static int kvm_vcpu_ioctl_x86_setup_mce ( struct kvm_vcpu * vcpu ,
u64 mcg_cap )
{
int r ;
unsigned bank_num = mcg_cap & 0xff , bank ;
r = - EINVAL ;
2009-10-23 11:37:00 +04:00
if ( ! bank_num | | bank_num > = KVM_MAX_MCE_BANKS )
2009-05-11 12:48:15 +04:00
goto out ;
if ( mcg_cap & ~ ( KVM_MCE_CAP_SUPPORTED | 0xff | 0xff0000 ) )
goto out ;
r = 0 ;
vcpu - > arch . mcg_cap = mcg_cap ;
/* Init IA32_MCG_CTL to all 1s */
if ( mcg_cap & MCG_CTL_P )
vcpu - > arch . mcg_ctl = ~ ( u64 ) 0 ;
/* Init IA32_MCi_CTL to all 1s */
for ( bank = 0 ; bank < bank_num ; bank + + )
vcpu - > arch . mce_banks [ bank * 4 ] = ~ ( u64 ) 0 ;
out :
return r ;
}
static int kvm_vcpu_ioctl_x86_set_mce ( struct kvm_vcpu * vcpu ,
struct kvm_x86_mce * mce )
{
u64 mcg_cap = vcpu - > arch . mcg_cap ;
unsigned bank_num = mcg_cap & 0xff ;
u64 * banks = vcpu - > arch . mce_banks ;
if ( mce - > bank > = bank_num | | ! ( mce - > status & MCI_STATUS_VAL ) )
return - EINVAL ;
/*
* if IA32_MCG_CTL is not all 1s , the uncorrected error
* reporting is disabled
*/
if ( ( mce - > status & MCI_STATUS_UC ) & & ( mcg_cap & MCG_CTL_P ) & &
vcpu - > arch . mcg_ctl ! = ~ ( u64 ) 0 )
return 0 ;
banks + = 4 * mce - > bank ;
/*
* if IA32_MCi_CTL is not all 1s , the uncorrected error
* reporting is disabled for the bank
*/
if ( ( mce - > status & MCI_STATUS_UC ) & & banks [ 0 ] ! = ~ ( u64 ) 0 )
return 0 ;
if ( mce - > status & MCI_STATUS_UC ) {
if ( ( vcpu - > arch . mcg_status & MCG_STATUS_MCIP ) | |
2009-12-07 13:16:48 +03:00
! kvm_read_cr4_bits ( vcpu , X86_CR4_MCE ) ) {
2010-05-10 13:34:53 +04:00
kvm_make_request ( KVM_REQ_TRIPLE_FAULT , vcpu ) ;
2009-05-11 12:48:15 +04:00
return 0 ;
}
if ( banks [ 1 ] & MCI_STATUS_VAL )
mce - > status | = MCI_STATUS_OVER ;
banks [ 2 ] = mce - > addr ;
banks [ 3 ] = mce - > misc ;
vcpu - > arch . mcg_status = mce - > mcg_status ;
banks [ 1 ] = mce - > status ;
kvm_queue_exception ( vcpu , MC_VECTOR ) ;
} else if ( ! ( banks [ 1 ] & MCI_STATUS_VAL )
| | ! ( banks [ 1 ] & MCI_STATUS_UC ) ) {
if ( banks [ 1 ] & MCI_STATUS_VAL )
mce - > status | = MCI_STATUS_OVER ;
banks [ 2 ] = mce - > addr ;
banks [ 3 ] = mce - > misc ;
banks [ 1 ] = mce - > status ;
} else
banks [ 1 ] | = MCI_STATUS_OVER ;
return 0 ;
}
2009-11-12 03:04:25 +03:00
static void kvm_vcpu_ioctl_x86_get_vcpu_events ( struct kvm_vcpu * vcpu ,
struct kvm_vcpu_events * events )
{
2011-09-20 14:43:14 +04:00
process_nmi ( vcpu ) ;
2010-02-15 12:45:41 +03:00
events - > exception . injected =
vcpu - > arch . exception . pending & &
! kvm_exception_is_soft ( vcpu - > arch . exception . nr ) ;
2009-11-12 03:04:25 +03:00
events - > exception . nr = vcpu - > arch . exception . nr ;
events - > exception . has_error_code = vcpu - > arch . exception . has_error_code ;
2010-10-30 22:54:47 +04:00
events - > exception . pad = 0 ;
2009-11-12 03:04:25 +03:00
events - > exception . error_code = vcpu - > arch . exception . error_code ;
2010-02-15 12:45:41 +03:00
events - > interrupt . injected =
vcpu - > arch . interrupt . pending & & ! vcpu - > arch . interrupt . soft ;
2009-11-12 03:04:25 +03:00
events - > interrupt . nr = vcpu - > arch . interrupt . nr ;
2010-02-15 12:45:41 +03:00
events - > interrupt . soft = 0 ;
2010-02-19 21:38:07 +03:00
events - > interrupt . shadow =
kvm_x86_ops - > get_interrupt_shadow ( vcpu ,
KVM_X86_SHADOW_INT_MOV_SS | KVM_X86_SHADOW_INT_STI ) ;
2009-11-12 03:04:25 +03:00
events - > nmi . injected = vcpu - > arch . nmi_injected ;
2011-09-20 14:43:14 +04:00
events - > nmi . pending = vcpu - > arch . nmi_pending ! = 0 ;
2009-11-12 03:04:25 +03:00
events - > nmi . masked = kvm_x86_ops - > get_nmi_mask ( vcpu ) ;
2010-10-30 22:54:47 +04:00
events - > nmi . pad = 0 ;
2009-11-12 03:04:25 +03:00
events - > sipi_vector = vcpu - > arch . sipi_vector ;
2009-12-06 20:24:15 +03:00
events - > flags = ( KVM_VCPUEVENT_VALID_NMI_PENDING
2010-02-19 21:38:07 +03:00
| KVM_VCPUEVENT_VALID_SIPI_VECTOR
| KVM_VCPUEVENT_VALID_SHADOW ) ;
2010-10-30 22:54:47 +04:00
memset ( & events - > reserved , 0 , sizeof ( events - > reserved ) ) ;
2009-11-12 03:04:25 +03:00
}
static int kvm_vcpu_ioctl_x86_set_vcpu_events ( struct kvm_vcpu * vcpu ,
struct kvm_vcpu_events * events )
{
2009-12-06 20:24:15 +03:00
if ( events - > flags & ~ ( KVM_VCPUEVENT_VALID_NMI_PENDING
2010-02-19 21:38:07 +03:00
| KVM_VCPUEVENT_VALID_SIPI_VECTOR
| KVM_VCPUEVENT_VALID_SHADOW ) )
2009-11-12 03:04:25 +03:00
return - EINVAL ;
2011-09-20 14:43:14 +04:00
process_nmi ( vcpu ) ;
2009-11-12 03:04:25 +03:00
vcpu - > arch . exception . pending = events - > exception . injected ;
vcpu - > arch . exception . nr = events - > exception . nr ;
vcpu - > arch . exception . has_error_code = events - > exception . has_error_code ;
vcpu - > arch . exception . error_code = events - > exception . error_code ;
vcpu - > arch . interrupt . pending = events - > interrupt . injected ;
vcpu - > arch . interrupt . nr = events - > interrupt . nr ;
vcpu - > arch . interrupt . soft = events - > interrupt . soft ;
2010-02-19 21:38:07 +03:00
if ( events - > flags & KVM_VCPUEVENT_VALID_SHADOW )
kvm_x86_ops - > set_interrupt_shadow ( vcpu ,
events - > interrupt . shadow ) ;
2009-11-12 03:04:25 +03:00
vcpu - > arch . nmi_injected = events - > nmi . injected ;
2009-12-06 20:24:15 +03:00
if ( events - > flags & KVM_VCPUEVENT_VALID_NMI_PENDING )
vcpu - > arch . nmi_pending = events - > nmi . pending ;
2009-11-12 03:04:25 +03:00
kvm_x86_ops - > set_nmi_mask ( vcpu , events - > nmi . masked ) ;
2009-12-06 20:24:15 +03:00
if ( events - > flags & KVM_VCPUEVENT_VALID_SIPI_VECTOR )
vcpu - > arch . sipi_vector = events - > sipi_vector ;
2009-11-12 03:04:25 +03:00
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2009-11-12 03:04:25 +03:00
return 0 ;
}
2010-02-15 12:45:43 +03:00
static void kvm_vcpu_ioctl_x86_get_debugregs ( struct kvm_vcpu * vcpu ,
struct kvm_debugregs * dbgregs )
{
memcpy ( dbgregs - > db , vcpu - > arch . db , sizeof ( vcpu - > arch . db ) ) ;
dbgregs - > dr6 = vcpu - > arch . dr6 ;
dbgregs - > dr7 = vcpu - > arch . dr7 ;
dbgregs - > flags = 0 ;
2010-10-30 22:54:47 +04:00
memset ( & dbgregs - > reserved , 0 , sizeof ( dbgregs - > reserved ) ) ;
2010-02-15 12:45:43 +03:00
}
static int kvm_vcpu_ioctl_x86_set_debugregs ( struct kvm_vcpu * vcpu ,
struct kvm_debugregs * dbgregs )
{
if ( dbgregs - > flags )
return - EINVAL ;
memcpy ( vcpu - > arch . db , dbgregs - > db , sizeof ( vcpu - > arch . db ) ) ;
vcpu - > arch . dr6 = dbgregs - > dr6 ;
vcpu - > arch . dr7 = dbgregs - > dr7 ;
return 0 ;
}
2010-06-13 13:29:39 +04:00
static void kvm_vcpu_ioctl_x86_get_xsave ( struct kvm_vcpu * vcpu ,
struct kvm_xsave * guest_xsave )
{
if ( cpu_has_xsave )
memcpy ( guest_xsave - > region ,
& vcpu - > arch . guest_fpu . state - > xsave ,
2010-08-13 11:19:11 +04:00
xstate_size ) ;
2010-06-13 13:29:39 +04:00
else {
memcpy ( guest_xsave - > region ,
& vcpu - > arch . guest_fpu . state - > fxsave ,
sizeof ( struct i387_fxsave_struct ) ) ;
* ( u64 * ) & guest_xsave - > region [ XSAVE_HDR_OFFSET / sizeof ( u32 ) ] =
XSTATE_FPSSE ;
}
}
static int kvm_vcpu_ioctl_x86_set_xsave ( struct kvm_vcpu * vcpu ,
struct kvm_xsave * guest_xsave )
{
u64 xstate_bv =
* ( u64 * ) & guest_xsave - > region [ XSAVE_HDR_OFFSET / sizeof ( u32 ) ] ;
if ( cpu_has_xsave )
memcpy ( & vcpu - > arch . guest_fpu . state - > xsave ,
2010-08-13 11:19:11 +04:00
guest_xsave - > region , xstate_size ) ;
2010-06-13 13:29:39 +04:00
else {
if ( xstate_bv & ~ XSTATE_FPSSE )
return - EINVAL ;
memcpy ( & vcpu - > arch . guest_fpu . state - > fxsave ,
guest_xsave - > region , sizeof ( struct i387_fxsave_struct ) ) ;
}
return 0 ;
}
static void kvm_vcpu_ioctl_x86_get_xcrs ( struct kvm_vcpu * vcpu ,
struct kvm_xcrs * guest_xcrs )
{
if ( ! cpu_has_xsave ) {
guest_xcrs - > nr_xcrs = 0 ;
return ;
}
guest_xcrs - > nr_xcrs = 1 ;
guest_xcrs - > flags = 0 ;
guest_xcrs - > xcrs [ 0 ] . xcr = XCR_XFEATURE_ENABLED_MASK ;
guest_xcrs - > xcrs [ 0 ] . value = vcpu - > arch . xcr0 ;
}
static int kvm_vcpu_ioctl_x86_set_xcrs ( struct kvm_vcpu * vcpu ,
struct kvm_xcrs * guest_xcrs )
{
int i , r = 0 ;
if ( ! cpu_has_xsave )
return - EINVAL ;
if ( guest_xcrs - > nr_xcrs > KVM_MAX_XCRS | | guest_xcrs - > flags )
return - EINVAL ;
for ( i = 0 ; i < guest_xcrs - > nr_xcrs ; i + + )
/* Only support XCR0 currently */
if ( guest_xcrs - > xcrs [ i ] . xcr = = XCR_XFEATURE_ENABLED_MASK ) {
r = __kvm_set_xcr ( vcpu , XCR_XFEATURE_ENABLED_MASK ,
guest_xcrs - > xcrs [ i ] . value ) ;
break ;
}
if ( r )
r = - EINVAL ;
return r ;
}
2012-03-10 23:37:27 +04:00
/*
* kvm_set_guest_paused() indicates to the guest kernel that it has been
* stopped by the hypervisor. This function will be called from the host only.
* EINVAL is returned when the host attempts to set the flag for a guest that
* does not support pv clocks.
*/
static int kvm_set_guest_paused ( struct kvm_vcpu * vcpu )
{
struct pvclock_vcpu_time_info * src = & vcpu - > arch . hv_clock ;
if ( ! vcpu - > arch . time_page )
return - EINVAL ;
src - > flags | = PVCLOCK_GUEST_STOPPED ;
mark_page_dirty ( vcpu - > kvm , vcpu - > arch . time > > PAGE_SHIFT ) ;
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ;
return 0 ;
}
2007-10-11 21:16:52 +04:00
long kvm_arch_vcpu_ioctl ( struct file * filp ,
unsigned int ioctl , unsigned long arg )
{
struct kvm_vcpu * vcpu = filp - > private_data ;
void __user * argp = ( void __user * ) arg ;
int r ;
2010-06-20 16:54:43 +04:00
union {
struct kvm_lapic_state * lapic ;
struct kvm_xsave * xsave ;
struct kvm_xcrs * xcrs ;
void * buffer ;
} u ;
u . buffer = NULL ;
2007-10-11 21:16:52 +04:00
switch ( ioctl ) {
case KVM_GET_LAPIC : {
2009-10-29 18:44:16 +03:00
r = - EINVAL ;
if ( ! vcpu - > arch . apic )
goto out ;
2010-06-20 16:54:43 +04:00
u . lapic = kzalloc ( sizeof ( struct kvm_lapic_state ) , GFP_KERNEL ) ;
2007-10-11 21:16:52 +04:00
2008-08-11 21:01:47 +04:00
r = - ENOMEM ;
2010-06-20 16:54:43 +04:00
if ( ! u . lapic )
2008-08-11 21:01:47 +04:00
goto out ;
2010-06-20 16:54:43 +04:00
r = kvm_vcpu_ioctl_get_lapic ( vcpu , u . lapic ) ;
2007-10-11 21:16:52 +04:00
if ( r )
goto out ;
r = - EFAULT ;
2010-06-20 16:54:43 +04:00
if ( copy_to_user ( argp , u . lapic , sizeof ( struct kvm_lapic_state ) ) )
2007-10-11 21:16:52 +04:00
goto out ;
r = 0 ;
break ;
}
case KVM_SET_LAPIC : {
2009-10-29 18:44:16 +03:00
r = - EINVAL ;
if ( ! vcpu - > arch . apic )
goto out ;
2011-12-04 21:36:29 +04:00
u . lapic = memdup_user ( argp , sizeof ( * u . lapic ) ) ;
if ( IS_ERR ( u . lapic ) ) {
r = PTR_ERR ( u . lapic ) ;
2007-10-11 21:16:52 +04:00
goto out ;
2011-12-04 21:36:29 +04:00
}
2010-06-20 16:54:43 +04:00
r = kvm_vcpu_ioctl_set_lapic ( vcpu , u . lapic ) ;
2007-10-11 21:16:52 +04:00
if ( r )
goto out ;
r = 0 ;
break ;
}
2007-11-20 23:36:41 +03:00
case KVM_INTERRUPT : {
struct kvm_interrupt irq ;
r = - EFAULT ;
if ( copy_from_user ( & irq , argp , sizeof irq ) )
goto out ;
r = kvm_vcpu_ioctl_interrupt ( vcpu , & irq ) ;
if ( r )
goto out ;
r = 0 ;
break ;
}
2008-09-26 11:30:55 +04:00
case KVM_NMI : {
r = kvm_vcpu_ioctl_nmi ( vcpu ) ;
if ( r )
goto out ;
r = 0 ;
break ;
}
2007-10-11 21:16:52 +04:00
case KVM_SET_CPUID : {
struct kvm_cpuid __user * cpuid_arg = argp ;
struct kvm_cpuid cpuid ;
r = - EFAULT ;
if ( copy_from_user ( & cpuid , cpuid_arg , sizeof cpuid ) )
goto out ;
r = kvm_vcpu_ioctl_set_cpuid ( vcpu , & cpuid , cpuid_arg - > entries ) ;
if ( r )
goto out ;
break ;
}
2007-11-21 18:10:04 +03:00
case KVM_SET_CPUID2 : {
struct kvm_cpuid2 __user * cpuid_arg = argp ;
struct kvm_cpuid2 cpuid ;
r = - EFAULT ;
if ( copy_from_user ( & cpuid , cpuid_arg , sizeof cpuid ) )
goto out ;
r = kvm_vcpu_ioctl_set_cpuid2 ( vcpu , & cpuid ,
2009-01-14 19:56:00 +03:00
cpuid_arg - > entries ) ;
2007-11-21 18:10:04 +03:00
if ( r )
goto out ;
break ;
}
case KVM_GET_CPUID2 : {
struct kvm_cpuid2 __user * cpuid_arg = argp ;
struct kvm_cpuid2 cpuid ;
r = - EFAULT ;
if ( copy_from_user ( & cpuid , cpuid_arg , sizeof cpuid ) )
goto out ;
r = kvm_vcpu_ioctl_get_cpuid2 ( vcpu , & cpuid ,
2009-01-14 19:56:00 +03:00
cpuid_arg - > entries ) ;
2007-11-21 18:10:04 +03:00
if ( r )
goto out ;
r = - EFAULT ;
if ( copy_to_user ( cpuid_arg , & cpuid , sizeof cpuid ) )
goto out ;
r = 0 ;
break ;
}
2007-10-11 21:16:52 +04:00
case KVM_GET_MSRS :
r = msr_io ( vcpu , argp , kvm_get_msr , 1 ) ;
break ;
case KVM_SET_MSRS :
r = msr_io ( vcpu , argp , do_set_msr , 0 ) ;
break ;
2007-10-22 18:50:39 +04:00
case KVM_TPR_ACCESS_REPORTING : {
struct kvm_tpr_access_ctl tac ;
r = - EFAULT ;
if ( copy_from_user ( & tac , argp , sizeof tac ) )
goto out ;
r = vcpu_ioctl_tpr_access_reporting ( vcpu , & tac ) ;
if ( r )
goto out ;
r = - EFAULT ;
if ( copy_to_user ( argp , & tac , sizeof tac ) )
goto out ;
r = 0 ;
break ;
}
2007-10-25 18:52:32 +04:00
case KVM_SET_VAPIC_ADDR : {
struct kvm_vapic_addr va ;
r = - EINVAL ;
if ( ! irqchip_in_kernel ( vcpu - > kvm ) )
goto out ;
r = - EFAULT ;
if ( copy_from_user ( & va , argp , sizeof va ) )
goto out ;
r = 0 ;
kvm_lapic_set_vapic_addr ( vcpu , va . vapic_addr ) ;
break ;
}
2009-05-11 12:48:15 +04:00
case KVM_X86_SETUP_MCE : {
u64 mcg_cap ;
r = - EFAULT ;
if ( copy_from_user ( & mcg_cap , argp , sizeof mcg_cap ) )
goto out ;
r = kvm_vcpu_ioctl_x86_setup_mce ( vcpu , mcg_cap ) ;
break ;
}
case KVM_X86_SET_MCE : {
struct kvm_x86_mce mce ;
r = - EFAULT ;
if ( copy_from_user ( & mce , argp , sizeof mce ) )
goto out ;
r = kvm_vcpu_ioctl_x86_set_mce ( vcpu , & mce ) ;
break ;
}
2009-11-12 03:04:25 +03:00
case KVM_GET_VCPU_EVENTS : {
struct kvm_vcpu_events events ;
kvm_vcpu_ioctl_x86_get_vcpu_events ( vcpu , & events ) ;
r = - EFAULT ;
if ( copy_to_user ( argp , & events , sizeof ( struct kvm_vcpu_events ) ) )
break ;
r = 0 ;
break ;
}
case KVM_SET_VCPU_EVENTS : {
struct kvm_vcpu_events events ;
r = - EFAULT ;
if ( copy_from_user ( & events , argp , sizeof ( struct kvm_vcpu_events ) ) )
break ;
r = kvm_vcpu_ioctl_x86_set_vcpu_events ( vcpu , & events ) ;
break ;
}
2010-02-15 12:45:43 +03:00
case KVM_GET_DEBUGREGS : {
struct kvm_debugregs dbgregs ;
kvm_vcpu_ioctl_x86_get_debugregs ( vcpu , & dbgregs ) ;
r = - EFAULT ;
if ( copy_to_user ( argp , & dbgregs ,
sizeof ( struct kvm_debugregs ) ) )
break ;
r = 0 ;
break ;
}
case KVM_SET_DEBUGREGS : {
struct kvm_debugregs dbgregs ;
r = - EFAULT ;
if ( copy_from_user ( & dbgregs , argp ,
sizeof ( struct kvm_debugregs ) ) )
break ;
r = kvm_vcpu_ioctl_x86_set_debugregs ( vcpu , & dbgregs ) ;
break ;
}
2010-06-13 13:29:39 +04:00
case KVM_GET_XSAVE : {
2010-06-20 16:54:43 +04:00
u . xsave = kzalloc ( sizeof ( struct kvm_xsave ) , GFP_KERNEL ) ;
2010-06-13 13:29:39 +04:00
r = - ENOMEM ;
2010-06-20 16:54:43 +04:00
if ( ! u . xsave )
2010-06-13 13:29:39 +04:00
break ;
2010-06-20 16:54:43 +04:00
kvm_vcpu_ioctl_x86_get_xsave ( vcpu , u . xsave ) ;
2010-06-13 13:29:39 +04:00
r = - EFAULT ;
2010-06-20 16:54:43 +04:00
if ( copy_to_user ( argp , u . xsave , sizeof ( struct kvm_xsave ) ) )
2010-06-13 13:29:39 +04:00
break ;
r = 0 ;
break ;
}
case KVM_SET_XSAVE : {
2011-12-04 21:36:29 +04:00
u . xsave = memdup_user ( argp , sizeof ( * u . xsave ) ) ;
if ( IS_ERR ( u . xsave ) ) {
r = PTR_ERR ( u . xsave ) ;
goto out ;
}
2010-06-13 13:29:39 +04:00
2010-06-20 16:54:43 +04:00
r = kvm_vcpu_ioctl_x86_set_xsave ( vcpu , u . xsave ) ;
2010-06-13 13:29:39 +04:00
break ;
}
case KVM_GET_XCRS : {
2010-06-20 16:54:43 +04:00
u . xcrs = kzalloc ( sizeof ( struct kvm_xcrs ) , GFP_KERNEL ) ;
2010-06-13 13:29:39 +04:00
r = - ENOMEM ;
2010-06-20 16:54:43 +04:00
if ( ! u . xcrs )
2010-06-13 13:29:39 +04:00
break ;
2010-06-20 16:54:43 +04:00
kvm_vcpu_ioctl_x86_get_xcrs ( vcpu , u . xcrs ) ;
2010-06-13 13:29:39 +04:00
r = - EFAULT ;
2010-06-20 16:54:43 +04:00
if ( copy_to_user ( argp , u . xcrs ,
2010-06-13 13:29:39 +04:00
sizeof ( struct kvm_xcrs ) ) )
break ;
r = 0 ;
break ;
}
case KVM_SET_XCRS : {
2011-12-04 21:36:29 +04:00
u . xcrs = memdup_user ( argp , sizeof ( * u . xcrs ) ) ;
if ( IS_ERR ( u . xcrs ) ) {
r = PTR_ERR ( u . xcrs ) ;
goto out ;
}
2010-06-13 13:29:39 +04:00
2010-06-20 16:54:43 +04:00
r = kvm_vcpu_ioctl_x86_set_xcrs ( vcpu , u . xcrs ) ;
2010-06-13 13:29:39 +04:00
break ;
}
2011-03-25 11:44:51 +03:00
case KVM_SET_TSC_KHZ : {
u32 user_tsc_khz ;
r = - EINVAL ;
user_tsc_khz = ( u32 ) arg ;
if ( user_tsc_khz > = kvm_max_guest_tsc_khz )
goto out ;
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
if ( user_tsc_khz = = 0 )
user_tsc_khz = tsc_khz ;
kvm_set_tsc_khz ( vcpu , user_tsc_khz ) ;
2011-03-25 11:44:51 +03:00
r = 0 ;
goto out ;
}
case KVM_GET_TSC_KHZ : {
2012-02-03 21:43:50 +04:00
r = vcpu - > arch . virtual_tsc_khz ;
2011-03-25 11:44:51 +03:00
goto out ;
}
2012-03-10 23:37:27 +04:00
case KVM_KVMCLOCK_CTRL : {
r = kvm_set_guest_paused ( vcpu ) ;
goto out ;
}
2007-10-11 21:16:52 +04:00
default :
r = - EINVAL ;
}
out :
2010-06-20 16:54:43 +04:00
kfree ( u . buffer ) ;
2007-10-11 21:16:52 +04:00
return r ;
}
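The handler above repeats one idiom in almost every case: set `r` to the error that the next step could produce, jump to `out` on failure, and let a single `kfree()` at `out` release whatever buffer was allocated. A minimal userspace sketch of that shape follows. The names `sketch_ioctl` and `SKETCH_CMD_LEN` are hypothetical, and `malloc`/`free` stand in for `memdup_user`/`kfree`.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_CMD_LEN 1u /* hypothetical command number */

/*
 * Userspace model of the "preset r, goto out, free once" idiom.
 * Not KVM code: it only mirrors the control flow of the handler.
 */
static int sketch_ioctl(unsigned int cmd, const char *arg, size_t *out_len)
{
    char *buffer = NULL; /* plays the role of u.buffer */
    int r;

    switch (cmd) {
    case SKETCH_CMD_LEN:
        r = -EFAULT;            /* error if the argument is unusable */
        if (!arg)
            goto out;
        r = -ENOMEM;            /* error if the copy cannot be allocated */
        buffer = malloc(strlen(arg) + 1);
        if (!buffer)
            goto out;
        strcpy(buffer, arg);
        *out_len = strlen(buffer);
        r = 0;
        break;
    default:
        r = -EINVAL;
    }
out:
    free(buffer); /* free(NULL) is a no-op, so every path may land here */
    return r;
}
```

Because the cleanup sits after the only exit label, a new case can allocate into `buffer` without adding its own free on each error path.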
2012-01-04 13:25:23 +04:00
int kvm_arch_vcpu_fault ( struct kvm_vcpu * vcpu , struct vm_fault * vmf )
{
return VM_FAULT_SIGBUS ;
}
2007-10-29 18:08:35 +03:00
static int kvm_vm_ioctl_set_tss_addr ( struct kvm * kvm , unsigned long addr )
{
int ret ;
if ( addr > ( unsigned int ) ( - 3 * PAGE_SIZE ) )
return - 1 ;
ret = kvm_x86_ops - > set_tss_addr ( kvm , addr ) ;
return ret ;
}
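The bound in `kvm_vm_ioctl_set_tss_addr` is terse, so here is the arithmetic spelled out, assuming 4 KiB pages. `-3 * PAGE_SIZE` wraps around, and truncating it to `unsigned int` gives 0xFFFFD000, the highest address at which three pages still end at or below 4 GiB. `SKETCH_PAGE_SIZE` and `tss_addr_in_range` are illustrative names, not kernel symbols.

```c
#include <stdbool.h>

#define SKETCH_PAGE_SIZE 4096UL /* assumption: 4 KiB pages */

/*
 * Mirrors the check above: the address is acceptable when three pages
 * starting at addr do not extend past the 4 GiB boundary.
 */
static bool tss_addr_in_range(unsigned long addr)
{
    return addr <= (unsigned int)(-3 * SKETCH_PAGE_SIZE);
}
```

With these assumptions, 0xFFFFD000 is accepted and 0xFFFFD001 is rejected.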
2009-07-21 06:42:48 +04:00
static int kvm_vm_ioctl_set_identity_map_addr ( struct kvm * kvm ,
u64 ident_addr )
{
kvm - > arch . ept_identity_map_addr = ident_addr ;
return 0 ;
}
2007-10-29 18:08:35 +03:00
static int kvm_vm_ioctl_set_nr_mmu_pages ( struct kvm * kvm ,
u32 kvm_nr_mmu_pages )
{
if ( kvm_nr_mmu_pages < KVM_MIN_ALLOC_MMU_PAGES )
return - EINVAL ;
2009-12-23 19:35:26 +03:00
mutex_lock ( & kvm - > slots_lock ) ;
2009-05-13 01:55:43 +04:00
spin_lock ( & kvm - > mmu_lock ) ;
2007-10-29 18:08:35 +03:00
kvm_mmu_change_mmu_pages ( kvm , kvm_nr_mmu_pages ) ;
2007-12-14 05:01:48 +03:00
kvm - > arch . n_requested_mmu_pages = kvm_nr_mmu_pages ;
2007-10-29 18:08:35 +03:00
2009-05-13 01:55:43 +04:00
spin_unlock ( & kvm - > mmu_lock ) ;
2009-12-23 19:35:26 +03:00
mutex_unlock ( & kvm - > slots_lock ) ;
2007-10-29 18:08:35 +03:00
return 0 ;
}
static int kvm_vm_ioctl_get_nr_mmu_pages ( struct kvm * kvm )
{
2010-08-20 05:11:14 +04:00
return kvm - > arch . n_max_mmu_pages ;
2007-10-29 18:08:35 +03:00
}
static int kvm_vm_ioctl_get_irqchip ( struct kvm * kvm , struct kvm_irqchip * chip )
{
int r ;
r = 0 ;
switch ( chip - > chip_id ) {
case KVM_IRQCHIP_PIC_MASTER :
memcpy ( & chip - > chip . pic ,
& pic_irqchip ( kvm ) - > pics [ 0 ] ,
sizeof ( struct kvm_pic_state ) ) ;
break ;
case KVM_IRQCHIP_PIC_SLAVE :
memcpy ( & chip - > chip . pic ,
& pic_irqchip ( kvm ) - > pics [ 1 ] ,
sizeof ( struct kvm_pic_state ) ) ;
break ;
case KVM_IRQCHIP_IOAPIC :
2009-08-24 12:54:25 +04:00
r = kvm_get_ioapic ( kvm , & chip - > chip . ioapic ) ;
2007-10-29 18:08:35 +03:00
break ;
default :
r = - EINVAL ;
break ;
}
return r ;
}
static int kvm_vm_ioctl_set_irqchip ( struct kvm * kvm , struct kvm_irqchip * chip )
{
int r ;
r = 0 ;
switch ( chip - > chip_id ) {
case KVM_IRQCHIP_PIC_MASTER :
2010-09-19 20:44:07 +04:00
spin_lock ( & pic_irqchip ( kvm ) - > lock ) ;
2007-10-29 18:08:35 +03:00
memcpy ( & pic_irqchip ( kvm ) - > pics [ 0 ] ,
& chip - > chip . pic ,
sizeof ( struct kvm_pic_state ) ) ;
2010-09-19 20:44:07 +04:00
spin_unlock ( & pic_irqchip ( kvm ) - > lock ) ;
2007-10-29 18:08:35 +03:00
break ;
case KVM_IRQCHIP_PIC_SLAVE :
2010-09-19 20:44:07 +04:00
spin_lock ( & pic_irqchip ( kvm ) - > lock ) ;
2007-10-29 18:08:35 +03:00
memcpy ( & pic_irqchip ( kvm ) - > pics [ 1 ] ,
& chip - > chip . pic ,
sizeof ( struct kvm_pic_state ) ) ;
2010-09-19 20:44:07 +04:00
spin_unlock ( & pic_irqchip ( kvm ) - > lock ) ;
2007-10-29 18:08:35 +03:00
break ;
case KVM_IRQCHIP_IOAPIC :
2009-08-24 12:54:25 +04:00
r = kvm_set_ioapic ( kvm , & chip - > chip . ioapic ) ;
2007-10-29 18:08:35 +03:00
break ;
default :
r = - EINVAL ;
break ;
}
kvm_pic_update_irq ( pic_irqchip ( kvm ) ) ;
return r ;
}
2008-03-03 19:50:59 +03:00
static int kvm_vm_ioctl_get_pit ( struct kvm * kvm , struct kvm_pit_state * ps )
{
int r = 0 ;
2009-06-23 22:05:14 +04:00
mutex_lock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-03-03 19:50:59 +03:00
memcpy ( ps , & kvm - > arch . vpit - > pit_state , sizeof ( struct kvm_pit_state ) ) ;
2009-06-23 22:05:14 +04:00
mutex_unlock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-03-03 19:50:59 +03:00
return r ;
}
static int kvm_vm_ioctl_set_pit ( struct kvm * kvm , struct kvm_pit_state * ps )
{
int r = 0 ;
2009-06-23 22:05:14 +04:00
mutex_lock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-03-03 19:50:59 +03:00
memcpy ( & kvm - > arch . vpit - > pit_state , ps , sizeof ( struct kvm_pit_state ) ) ;
2009-07-07 19:50:38 +04:00
kvm_pit_load_count ( kvm , 0 , ps - > channels [ 0 ] . count , 0 ) ;
mutex_unlock ( & kvm - > arch . vpit - > pit_state . lock ) ;
return r ;
}
static int kvm_vm_ioctl_get_pit2 ( struct kvm * kvm , struct kvm_pit_state2 * ps )
{
int r = 0 ;
mutex_lock ( & kvm - > arch . vpit - > pit_state . lock ) ;
memcpy ( ps - > channels , & kvm - > arch . vpit - > pit_state . channels ,
sizeof ( ps - > channels ) ) ;
ps - > flags = kvm - > arch . vpit - > pit_state . flags ;
mutex_unlock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2010-10-30 22:54:47 +04:00
memset ( & ps - > reserved , 0 , sizeof ( ps - > reserved ) ) ;
2009-07-07 19:50:38 +04:00
return r ;
}
static int kvm_vm_ioctl_set_pit2 ( struct kvm * kvm , struct kvm_pit_state2 * ps )
{
int r = 0 , start = 0 ;
u32 prev_legacy , cur_legacy ;
mutex_lock ( & kvm - > arch . vpit - > pit_state . lock ) ;
prev_legacy = kvm - > arch . vpit - > pit_state . flags & KVM_PIT_FLAGS_HPET_LEGACY ;
cur_legacy = ps - > flags & KVM_PIT_FLAGS_HPET_LEGACY ;
if ( ! prev_legacy & & cur_legacy )
start = 1 ;
memcpy ( & kvm - > arch . vpit - > pit_state . channels , & ps - > channels ,
sizeof ( kvm - > arch . vpit - > pit_state . channels ) ) ;
kvm - > arch . vpit - > pit_state . flags = ps - > flags ;
kvm_pit_load_count ( kvm , 0 , kvm - > arch . vpit - > pit_state . channels [ 0 ] . count , start ) ;
2009-06-23 22:05:14 +04:00
mutex_unlock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-03-03 19:50:59 +03:00
return r ;
}
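`kvm_vm_ioctl_set_pit2` sets `start` only when the HPET legacy flag goes from clear to set, that is, on a rising edge. The sketch below isolates that test. `SKETCH_FLAG_HPET_LEGACY` is a stand-in value for `KVM_PIT_FLAGS_HPET_LEGACY`, and the helper name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_FLAG_HPET_LEGACY 0x1u /* stand-in for KVM_PIT_FLAGS_HPET_LEGACY */

/* True only when the flag was clear in old_flags and is set in new_flags. */
static bool legacy_mode_just_enabled(uint32_t old_flags, uint32_t new_flags)
{
    uint32_t prev = old_flags & SKETCH_FLAG_HPET_LEGACY;
    uint32_t cur = new_flags & SKETCH_FLAG_HPET_LEGACY;

    return !prev && cur;
}
```

Setting the flag again while it is already set, or clearing it, does not count as a start.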
2008-12-30 20:55:06 +03:00
static int kvm_vm_ioctl_reinject ( struct kvm * kvm ,
struct kvm_reinject_control * control )
{
if ( ! kvm - > arch . vpit )
return - ENXIO ;
2009-06-23 22:05:14 +04:00
mutex_lock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-12-30 20:55:06 +03:00
kvm - > arch . vpit - > pit_state . pit_timer . reinject = control - > pit_reinject ;
2009-06-23 22:05:14 +04:00
mutex_unlock ( & kvm - > arch . vpit - > pit_state . lock ) ;
2008-12-30 20:55:06 +03:00
return 0 ;
}
2011-11-14 13:24:50 +04:00
/**
2012-03-03 09:21:48 +04:00
* kvm_vm_ioctl_get_dirty_log - get and clear the log of dirty pages in a slot
* @kvm: kvm instance
* @log: slot id and address to which we copy the log
2011-11-14 13:24:50 +04:00
*
2012-03-03 09:21:48 +04:00
* We need to keep in mind that VCPU threads can write to the bitmap
* concurrently. So, to avoid losing data, we keep the following order for
* each bit:
2011-11-14 13:24:50 +04:00
*
2012-03-03 09:21:48 +04:00
* 1. Take a snapshot of the bit and clear it if needed.
* 2. Write protect the corresponding page.
* 3. Flush TLBs if needed.
* 4. Copy the snapshot to the userspace.
2011-11-14 13:24:50 +04:00
*
2012-03-03 09:21:48 +04:00
* Between 2 and 3, the guest may write to the page using the remaining TLB
* entry. This is not a problem because the page will be reported dirty at
* step 4 using the snapshot taken before and step 3 ensures that successive
* writes will be logged for the next call.
2007-11-18 15:29:43 +03:00
*/
2012-03-03 09:21:48 +04:00
int kvm_vm_ioctl_get_dirty_log ( struct kvm * kvm , struct kvm_dirty_log * log )
2007-11-18 15:29:43 +03:00
{
2011-11-14 13:23:34 +04:00
int r ;
2007-11-18 15:29:43 +03:00
struct kvm_memory_slot * memslot ;
2012-03-03 09:21:48 +04:00
unsigned long n , i ;
unsigned long * dirty_bitmap ;
unsigned long * dirty_bitmap_buffer ;
bool is_dirty = false ;
2007-11-18 15:29:43 +03:00
2009-12-23 19:35:26 +03:00
mutex_lock ( & kvm - > slots_lock ) ;
2007-11-18 15:29:43 +03:00
2009-12-23 19:35:22 +03:00
r = - EINVAL ;
if ( log - > slot > = KVM_MEMORY_SLOTS )
goto out ;
2011-11-24 15:04:35 +04:00
memslot = id_to_memslot ( kvm - > memslots , log - > slot ) ;
2012-03-03 09:21:48 +04:00
dirty_bitmap = memslot - > dirty_bitmap ;
2009-12-23 19:35:22 +03:00
r = - ENOENT ;
2012-03-03 09:21:48 +04:00
if ( ! dirty_bitmap )
2009-12-23 19:35:22 +03:00
goto out ;
2010-04-12 14:35:35 +04:00
n = kvm_dirty_bitmap_bytes ( memslot ) ;
2009-12-23 19:35:22 +03:00
2012-03-03 09:21:48 +04:00
dirty_bitmap_buffer = dirty_bitmap + n / sizeof ( long ) ;
memset ( dirty_bitmap_buffer , 0 , n ) ;
2009-12-23 19:35:22 +03:00
2012-03-03 09:21:48 +04:00
spin_lock ( & kvm - > mmu_lock ) ;
2009-12-23 19:35:22 +03:00
2012-03-03 09:21:48 +04:00
for ( i = 0 ; i < n / sizeof ( long ) ; i + + ) {
unsigned long mask ;
gfn_t offset ;
2011-12-04 21:36:28 +04:00
2012-03-03 09:21:48 +04:00
if ( ! dirty_bitmap [ i ] )
continue ;
2009-12-23 19:35:22 +03:00
2012-03-03 09:21:48 +04:00
is_dirty = true ;
2010-04-28 13:50:36 +04:00
2012-03-03 09:21:48 +04:00
mask = xchg ( & dirty_bitmap [ i ] , 0 ) ;
dirty_bitmap_buffer [ i ] = mask ;
2010-10-25 05:21:24 +04:00
2012-03-03 09:21:48 +04:00
offset = i * BITS_PER_LONG ;
kvm_mmu_write_protect_pt_masked ( kvm , memslot , offset , mask ) ;
2007-11-18 15:29:43 +03:00
}
2012-03-03 09:21:48 +04:00
if ( is_dirty )
kvm_flush_remote_tlbs ( kvm ) ;
spin_unlock ( & kvm - > mmu_lock ) ;
r = - EFAULT ;
if ( copy_to_user ( log - > dirty_bitmap , dirty_bitmap_buffer , n ) )
goto out ;
2009-12-23 19:35:22 +03:00
2007-11-18 15:29:43 +03:00
r = 0 ;
out :
2009-12-23 19:35:26 +03:00
mutex_unlock ( & kvm - > slots_lock ) ;
2007-11-18 15:29:43 +03:00
return r ;
}
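Step 1 of the comment above (snapshot a word and clear it in one atomic operation) can be modelled in userspace with C11 atomics. This is only a sketch: `harvest_dirty` is a hypothetical name, `atomic_exchange` plays the role of the kernel's `xchg`, and the write-protect and TLB-flush steps are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Swap each non-zero word of the live bitmap with 0 and keep the old value
 * in the snapshot. A bit set concurrently by a writer ends up either in
 * this snapshot or in the live bitmap for the next call, never in neither.
 * Returns true if any dirty bit was harvested.
 */
static bool harvest_dirty(_Atomic unsigned long *live,
                          unsigned long *snapshot, size_t nwords)
{
    bool any = false;

    for (size_t i = 0; i < nwords; i++) {
        snapshot[i] = 0;
        if (!atomic_load(&live[i]))
            continue;           /* skip clean words without writing */
        snapshot[i] = atomic_exchange(&live[i], 0UL);
        any = true;
    }
    return any;
}
```

The return value corresponds to `is_dirty` in the function above, which decides whether a remote TLB flush is needed at all.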
2007-10-29 18:08:35 +03:00
long kvm_arch_vm_ioctl ( struct file * filp ,
unsigned int ioctl , unsigned long arg )
{
struct kvm * kvm = filp - > private_data ;
void __user * argp = ( void __user * ) arg ;
2009-08-26 15:57:07 +04:00
int r = - ENOTTY ;
2008-08-11 21:01:45 +04:00
/*
* This union makes it completely explicit to gcc-3.x
* that these two variables' stack usage should be
* combined, not added together.
*/
union {
struct kvm_pit_state ps ;
2009-07-07 19:50:38 +04:00
struct kvm_pit_state2 ps2 ;
2009-05-15 00:42:53 +04:00
struct kvm_pit_config pit_config ;
2008-08-11 21:01:45 +04:00
} u ;
2007-10-29 18:08:35 +03:00
switch ( ioctl ) {
case KVM_SET_TSS_ADDR :
r = kvm_vm_ioctl_set_tss_addr ( kvm , arg ) ;
if ( r < 0 )
goto out ;
break ;
2009-07-21 06:42:48 +04:00
case KVM_SET_IDENTITY_MAP_ADDR : {
u64 ident_addr ;
r = - EFAULT ;
if ( copy_from_user ( & ident_addr , argp , sizeof ident_addr ) )
goto out ;
r = kvm_vm_ioctl_set_identity_map_addr ( kvm , ident_addr ) ;
if ( r < 0 )
goto out ;
break ;
}
2007-10-29 18:08:35 +03:00
case KVM_SET_NR_MMU_PAGES :
r = kvm_vm_ioctl_set_nr_mmu_pages ( kvm , arg ) ;
if ( r )
goto out ;
break ;
case KVM_GET_NR_MMU_PAGES :
r = kvm_vm_ioctl_get_nr_mmu_pages ( kvm ) ;
break ;
2009-10-29 18:44:15 +03:00
case KVM_CREATE_IRQCHIP : {
struct kvm_pic * vpic ;
mutex_lock ( & kvm - > lock ) ;
r = - EEXIST ;
if ( kvm - > arch . vpic )
goto create_irqchip_unlock ;
2012-03-05 16:23:29 +04:00
r = - EINVAL ;
if ( atomic_read ( & kvm - > online_vcpus ) )
goto create_irqchip_unlock ;
2007-10-29 18:08:35 +03:00
r = - ENOMEM ;
2009-10-29 18:44:15 +03:00
vpic = kvm_create_pic ( kvm ) ;
if ( vpic ) {
2007-10-29 18:08:35 +03:00
r = kvm_ioapic_init ( kvm ) ;
if ( r ) {
2010-12-15 19:41:37 +03:00
mutex_lock ( & kvm - > slots_lock ) ;
2010-02-09 05:33:03 +03:00
kvm_io_bus_unregister_dev ( kvm , KVM_PIO_BUS ,
2011-07-27 17:00:48 +04:00
& vpic - > dev_master ) ;
kvm_io_bus_unregister_dev ( kvm , KVM_PIO_BUS ,
& vpic - > dev_slave ) ;
kvm_io_bus_unregister_dev ( kvm , KVM_PIO_BUS ,
& vpic - > dev_eclr ) ;
2010-12-15 19:41:37 +03:00
mutex_unlock ( & kvm - > slots_lock ) ;
2009-10-29 18:44:15 +03:00
kfree ( vpic ) ;
goto create_irqchip_unlock ;
2007-10-29 18:08:35 +03:00
}
} else
2009-10-29 18:44:15 +03:00
goto create_irqchip_unlock ;
smp_wmb ( ) ;
kvm - > arch . vpic = vpic ;
smp_wmb ( ) ;
2008-11-19 14:58:46 +03:00
r = kvm_setup_default_irq_routing ( kvm ) ;
if ( r ) {
2010-12-15 19:41:37 +03:00
mutex_lock ( & kvm - > slots_lock ) ;
2009-10-29 18:44:15 +03:00
mutex_lock ( & kvm - > irq_lock ) ;
2010-02-09 05:33:03 +03:00
kvm_ioapic_destroy ( kvm ) ;
kvm_destroy_pic ( kvm ) ;
2009-10-29 18:44:15 +03:00
mutex_unlock ( & kvm - > irq_lock ) ;
2010-12-15 19:41:37 +03:00
mutex_unlock ( & kvm - > slots_lock ) ;
2008-11-19 14:58:46 +03:00
}
2009-10-29 18:44:15 +03:00
create_irqchip_unlock :
mutex_unlock ( & kvm - > lock ) ;
2007-10-29 18:08:35 +03:00
break ;
2009-10-29 18:44:15 +03:00
}
2008-01-28 00:10:22 +03:00
case KVM_CREATE_PIT :
2009-05-15 00:42:53 +04:00
u . pit_config . flags = KVM_PIT_SPEAKER_DUMMY ;
goto create_pit ;
case KVM_CREATE_PIT2 :
r = - EFAULT ;
if ( copy_from_user ( & u . pit_config , argp ,
sizeof ( struct kvm_pit_config ) ) )
goto out ;
create_pit :
2009-12-23 19:35:26 +03:00
mutex_lock ( & kvm - > slots_lock ) ;
2009-01-05 16:21:42 +03:00
r = - EEXIST ;
if ( kvm - > arch . vpit )
goto create_pit_unlock ;
2008-01-28 00:10:22 +03:00
r = - ENOMEM ;
2009-05-15 00:42:53 +04:00
kvm - > arch . vpit = kvm_create_pit ( kvm , u . pit_config . flags ) ;
2008-01-28 00:10:22 +03:00
if ( kvm - > arch . vpit )
r = 0 ;
2009-01-05 16:21:42 +03:00
create_pit_unlock :
2009-12-23 19:35:26 +03:00
mutex_unlock ( & kvm - > slots_lock ) ;
2008-01-28 00:10:22 +03:00
break ;
2009-02-04 18:28:14 +03:00
case KVM_IRQ_LINE_STATUS :
2007-10-29 18:08:35 +03:00
case KVM_IRQ_LINE : {
struct kvm_irq_level irq_event ;
r = - EFAULT ;
if ( copy_from_user ( & irq_event , argp , sizeof irq_event ) )
goto out ;
2010-03-12 05:09:45 +03:00
r = - ENXIO ;
2007-10-29 18:08:35 +03:00
if ( irqchip_in_kernel ( kvm ) ) {
2009-02-04 18:28:14 +03:00
__s32 status ;
status = kvm_set_irq ( kvm , KVM_USERSPACE_IRQ_SOURCE_ID ,
irq_event . irq , irq_event . level ) ;
if ( ioctl = = KVM_IRQ_LINE_STATUS ) {
2010-03-12 05:09:45 +03:00
r = - EFAULT ;
2009-02-04 18:28:14 +03:00
irq_event . status = status ;
if ( copy_to_user ( argp , & irq_event ,
sizeof irq_event ) )
goto out ;
}
2007-10-29 18:08:35 +03:00
r = 0 ;
}
break ;
}
case KVM_GET_IRQCHIP : {
/* 0: PIC master, 1: PIC slave, 2: IOAPIC */
2011-12-04 21:36:29 +04:00
struct kvm_irqchip * chip ;
2007-10-29 18:08:35 +03:00
2011-12-04 21:36:29 +04:00
chip = memdup_user ( argp , sizeof ( * chip ) ) ;
if ( IS_ERR ( chip ) ) {
r = PTR_ERR ( chip ) ;
2007-10-29 18:08:35 +03:00
goto out ;
2011-12-04 21:36:29 +04:00
}
2007-10-29 18:08:35 +03:00
r = - ENXIO ;
if ( ! irqchip_in_kernel ( kvm ) )
2008-08-11 21:01:45 +04:00
goto get_irqchip_out ;
r = kvm_vm_ioctl_get_irqchip ( kvm , chip ) ;
2007-10-29 18:08:35 +03:00
if ( r )
2008-08-11 21:01:45 +04:00
goto get_irqchip_out ;
2007-10-29 18:08:35 +03:00
r = - EFAULT ;
2008-08-11 21:01:45 +04:00
if ( copy_to_user ( argp , chip , sizeof * chip ) )
goto get_irqchip_out ;
2007-10-29 18:08:35 +03:00
r = 0 ;
2008-08-11 21:01:45 +04:00
get_irqchip_out :
kfree ( chip ) ;
if ( r )
goto out ;
2007-10-29 18:08:35 +03:00
break ;
}
case KVM_SET_IRQCHIP : {
/* 0: PIC master, 1: PIC slave, 2: IOAPIC */
2011-12-04 21:36:29 +04:00
struct kvm_irqchip * chip ;
2007-10-29 18:08:35 +03:00
2011-12-04 21:36:29 +04:00
chip = memdup_user ( argp , sizeof ( * chip ) ) ;
if ( IS_ERR ( chip ) ) {
r = PTR_ERR ( chip ) ;
2007-10-29 18:08:35 +03:00
goto out ;
2011-12-04 21:36:29 +04:00
}
2007-10-29 18:08:35 +03:00
r = - ENXIO ;
if ( ! irqchip_in_kernel ( kvm ) )
2008-08-11 21:01:45 +04:00
goto set_irqchip_out ;
r = kvm_vm_ioctl_set_irqchip ( kvm , chip ) ;
2007-10-29 18:08:35 +03:00
if ( r )
2008-08-11 21:01:45 +04:00
goto set_irqchip_out ;
2007-10-29 18:08:35 +03:00
r = 0 ;
2008-08-11 21:01:45 +04:00
set_irqchip_out :
kfree ( chip ) ;
if ( r )
goto out ;
2007-10-29 18:08:35 +03:00
break ;
}
2008-03-03 19:50:59 +03:00
case KVM_GET_PIT : {
r = - EFAULT ;
2008-08-11 21:01:45 +04:00
if ( copy_from_user ( & u . ps , argp , sizeof ( struct kvm_pit_state ) ) )
2008-03-03 19:50:59 +03:00
goto out ;
r = - ENXIO ;
if ( ! kvm - > arch . vpit )
goto out ;
2008-08-11 21:01:45 +04:00
r = kvm_vm_ioctl_get_pit ( kvm , & u . ps ) ;
2008-03-03 19:50:59 +03:00
if ( r )
goto out ;
r = - EFAULT ;
2008-08-11 21:01:45 +04:00
if ( copy_to_user ( argp , & u . ps , sizeof ( struct kvm_pit_state ) ) )
2008-03-03 19:50:59 +03:00
goto out ;
r = 0 ;
break ;
}
case KVM_SET_PIT : {
r = - EFAULT ;
2008-08-11 21:01:45 +04:00
if ( copy_from_user ( & u . ps , argp , sizeof u . ps ) )
2008-03-03 19:50:59 +03:00
goto out ;
r = - ENXIO ;
if ( ! kvm - > arch . vpit )
goto out ;
2008-08-11 21:01:45 +04:00
r = kvm_vm_ioctl_set_pit ( kvm , & u . ps ) ;
2008-03-03 19:50:59 +03:00
if ( r )
goto out ;
r = 0 ;
break ;
}
2009-07-07 19:50:38 +04:00
case KVM_GET_PIT2 : {
r = - ENXIO ;
if ( ! kvm - > arch . vpit )
goto out ;
r = kvm_vm_ioctl_get_pit2 ( kvm , & u . ps2 ) ;
if ( r )
goto out ;
r = - EFAULT ;
if ( copy_to_user ( argp , & u . ps2 , sizeof ( u . ps2 ) ) )
goto out ;
r = 0 ;
break ;
}
case KVM_SET_PIT2 : {
r = - EFAULT ;
if ( copy_from_user ( & u . ps2 , argp , sizeof ( u . ps2 ) ) )
goto out ;
r = - ENXIO ;
if ( ! kvm - > arch . vpit )
goto out ;
r = kvm_vm_ioctl_set_pit2 ( kvm , & u . ps2 ) ;
if ( r )
goto out ;
r = 0 ;
break ;
}
2008-12-30 20:55:06 +03:00
case KVM_REINJECT_CONTROL : {
struct kvm_reinject_control control ;
r = - EFAULT ;
if ( copy_from_user ( & control , argp , sizeof ( control ) ) )
goto out ;
r = kvm_vm_ioctl_reinject ( kvm , & control ) ;
if ( r )
goto out ;
r = 0 ;
break ;
}
2009-10-16 02:21:43 +04:00
case KVM_XEN_HVM_CONFIG : {
r = - EFAULT ;
if ( copy_from_user ( & kvm - > arch . xen_hvm_config , argp ,
sizeof ( struct kvm_xen_hvm_config ) ) )
goto out ;
r = - EINVAL ;
if ( kvm - > arch . xen_hvm_config . flags )
goto out ;
r = 0 ;
break ;
}
2009-10-16 23:28:36 +04:00
case KVM_SET_CLOCK : {
struct kvm_clock_data user_ns ;
u64 now_ns ;
s64 delta ;
r = - EFAULT ;
if ( copy_from_user ( & user_ns , argp , sizeof ( user_ns ) ) )
goto out ;
r = - EINVAL ;
if ( user_ns . flags )
goto out ;
r = 0 ;
2010-10-04 14:55:49 +04:00
local_irq_disable ( ) ;
2010-08-20 12:07:25 +04:00
now_ns = get_kernel_ns ( ) ;
2009-10-16 23:28:36 +04:00
delta = user_ns . clock - now_ns ;
2010-10-04 14:55:49 +04:00
local_irq_enable ( ) ;
2009-10-16 23:28:36 +04:00
kvm - > arch . kvmclock_offset = delta ;
break ;
}
case KVM_GET_CLOCK : {
struct kvm_clock_data user_ns ;
u64 now_ns ;
2010-10-04 14:55:49 +04:00
local_irq_disable ( ) ;
2010-08-20 12:07:25 +04:00
now_ns = get_kernel_ns ( ) ;
2009-10-16 23:28:36 +04:00
user_ns . clock = kvm - > arch . kvmclock_offset + now_ns ;
2010-10-04 14:55:49 +04:00
local_irq_enable ( ) ;
2009-10-16 23:28:36 +04:00
user_ns . flags = 0 ;
2010-10-30 22:54:47 +04:00
memset ( & user_ns . pad , 0 , sizeof ( user_ns . pad ) ) ;
2009-10-16 23:28:36 +04:00
r = - EFAULT ;
if ( copy_to_user ( argp , & user_ns , sizeof ( user_ns ) ) )
goto out ;
r = 0 ;
break ;
}
2007-10-29 18:08:35 +03:00
default :
;
}
out :
return r ;
}
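/*
 * Illustrative sketch, not part of the driver: the KVM_SET_CLOCK and
 * KVM_GET_CLOCK cases above store the guest clock as an offset from host
 * time and rebuild it on read. The struct and function names below are
 * hypothetical stand-ins, and host time is passed in as a plain argument
 * instead of being sampled with interrupts disabled.
 */

```c
#include <stdint.h>

/* Hypothetical stand-in for the per-VM state; only the offset matters here. */
struct demo_clock_state {
	int64_t offset_ns;	/* plays the role of kvm->arch.kvmclock_offset */
};

/* SET: remember how far the requested guest clock is from host time. */
static void demo_clock_set(struct demo_clock_state *s, uint64_t guest_ns,
			   uint64_t host_now_ns)
{
	s->offset_ns = (int64_t)(guest_ns - host_now_ns);
}

/* GET: rebuild the guest clock from the stored offset and current host time. */
static uint64_t demo_clock_get(const struct demo_clock_state *s,
			       uint64_t host_now_ns)
{
	return (uint64_t)s->offset_ns + host_now_ns;
}
```

/*
 * Because only the difference is stored, a guest clock set behind host time
 * simply yields a negative offset, and later reads keep advancing with the
 * host clock.
 */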
2007-11-16 09:38:21 +03:00
static void kvm_init_msr_list ( void )
2007-10-10 19:16:19 +04:00
{
u32 dummy [ 2 ] ;
unsigned i , j ;
2009-10-06 21:24:50 +04:00
/* Skip the first MSRs in the list; they are KVM-specific. */
for ( i = j = KVM_SAVE_MSRS_BEGIN ; i < ARRAY_SIZE ( msrs_to_save ) ; i + + ) {
2007-10-10 19:16:19 +04:00
if ( rdmsr_safe ( msrs_to_save [ i ] , & dummy [ 0 ] , & dummy [ 1 ] ) < 0 )
continue ;
if ( j < i )
msrs_to_save [ j ] = msrs_to_save [ i ] ;
j + + ;
}
num_msrs_to_save = j ;
}
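/*
 * Illustrative sketch, not part of the driver: kvm_init_msr_list() above
 * filters an array in place with two indices, leaving a fixed prefix alone.
 * The same pattern is shown here in user-space C. The predicate type and
 * keep_even() are hypothetical; in the driver the test is whether
 * rdmsr_safe() succeeds.
 */

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical predicate: returns nonzero when the entry should be kept. */
typedef int (*demo_keep_fn)(uint32_t value);

/*
 * Compact list[begin..n) in place, keeping only entries accepted by keep().
 * Entries before 'begin' are left untouched. Returns the new element count.
 */
static size_t demo_compact_in_place(uint32_t *list, size_t n, size_t begin,
				    demo_keep_fn keep)
{
	size_t i, j;

	for (i = j = begin; i < n; i++) {
		if (!keep(list[i]))
			continue;
		if (j < i)	/* copy only when something was dropped earlier */
			list[j] = list[i];
		j++;
	}
	return j;
}

/* Example predicate for the demo: keep even values only. */
static int keep_even(uint32_t value)
{
	return (value % 2) == 0;
}
```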
2009-06-29 23:24:32 +04:00
static int vcpu_mmio_write ( struct kvm_vcpu * vcpu , gpa_t addr , int len ,
const void * v )
KVM: Portability: Move x86 emulation and mmio device hook to x86.c
This patch moves the following functions from kvm_main.c to x86.c:
emulator_read/write_std, vcpu_find_pervcpu_dev, vcpu_find_mmio_dev,
emulator_read/write_emulated, emulator_write_phys,
emulator_write_emulated_onepage, emulator_cmpxchg_emulated,
get_segment_base, emulate_invlpg, emulate_clts, emulator_get/set_dr,
kvm_report_emulation_failure, emulate_instruction
The following data type is moved to x86.c:
struct x86_emulate_ops emulate_ops
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-30 20:44:21 +03:00
{
2010-01-19 13:51:22 +03:00
int handled = 0 ;
int n ;
do {
n = min ( len , 8 ) ;
if ( ! ( vcpu - > arch . apic & &
! kvm_iodevice_write ( & vcpu - > arch . apic - > dev , addr , n , v ) )
& & kvm_io_bus_write ( vcpu - > kvm , KVM_MMIO_BUS , addr , n , v ) )
break ;
handled + = n ;
addr + = n ;
len - = n ;
v + = n ;
} while ( len ) ;
2007-10-30 20:44:21 +03:00
2010-01-19 13:51:22 +03:00
return handled ;
2007-10-30 20:44:21 +03:00
}
2009-06-29 23:24:32 +04:00
static int vcpu_mmio_read ( struct kvm_vcpu * vcpu , gpa_t addr , int len , void * v )
2007-10-30 20:44:21 +03:00
{
2010-01-19 13:51:22 +03:00
int handled = 0 ;
int n ;
do {
n = min ( len , 8 ) ;
if ( ! ( vcpu - > arch . apic & &
! kvm_iodevice_read ( & vcpu - > arch . apic - > dev , addr , n , v ) )
& & kvm_io_bus_read ( vcpu - > kvm , KVM_MMIO_BUS , addr , n , v ) )
break ;
trace_kvm_mmio ( KVM_TRACE_MMIO_READ , n , addr , * ( u64 * ) v ) ;
handled + = n ;
addr + = n ;
len - = n ;
v + = n ;
} while ( len ) ;
2007-10-30 20:44:21 +03:00
2010-01-19 13:51:22 +03:00
return handled ;
2007-10-30 20:44:21 +03:00
}
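/*
 * Illustrative sketch, not part of the driver: vcpu_mmio_write() and
 * vcpu_mmio_read() above walk an access in chunks of at most 8 bytes, stop
 * at the first chunk no device claims, and report how many bytes were
 * handled. The callback type and demo_dev() are hypothetical. Unlike the
 * do/while loops above, this version also tolerates len == 0.
 */

```c
/* Hypothetical device callback: returns 0 when it handled the chunk. */
typedef int (*demo_chunk_fn)(unsigned long addr, int len, void *buf);

static int demo_chunked_access(unsigned long addr, int len, void *buf,
			       demo_chunk_fn dev)
{
	unsigned char *p = buf;
	int handled = 0;

	while (len > 0) {
		int n = len < 8 ? len : 8;

		if (dev(addr, n, p))
			break;	/* first unhandled chunk ends the walk */
		handled += n;
		addr += n;
		p += n;
		len -= n;
	}
	return handled;
}

/* Demo device: claims only addresses below 0x1010. */
static int demo_dev(unsigned long addr, int len, void *buf)
{
	(void)len;
	(void)buf;
	return addr < 0x1010 ? 0 : -1;
}
```

/*
 * A caller compares the return value with the requested length; a shortfall
 * means the remaining bytes have to be completed some other way.
 */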
2010-03-18 16:20:16 +03:00
static void kvm_set_segment ( struct kvm_vcpu * vcpu ,
struct kvm_segment * var , int seg )
{
kvm_x86_ops - > set_segment ( vcpu , var , seg ) ;
}
void kvm_get_segment ( struct kvm_vcpu * vcpu ,
struct kvm_segment * var , int seg )
{
kvm_x86_ops - > get_segment ( vcpu , var , seg ) ;
}
2011-11-28 16:42:16 +04:00
gpa_t translate_nested_gpa ( struct kvm_vcpu * vcpu , gpa_t gpa , u32 access )
2010-09-10 19:30:54 +04:00
{
gpa_t t_gpa ;
2010-11-22 18:53:26 +03:00
struct x86_exception exception ;
2010-09-10 19:30:54 +04:00
BUG_ON ( ! mmu_is_nested ( vcpu ) ) ;
/* NPT walks are always user-walks */
access | = PFERR_USER_MASK ;
2010-11-22 18:53:26 +03:00
t_gpa = vcpu - > arch . mmu . gva_to_gpa ( vcpu , gpa , access , & exception ) ;
2010-09-10 19:30:54 +04:00
return t_gpa ;
}
2010-11-22 18:53:26 +03:00
gpa_t kvm_mmu_gva_to_gpa_read ( struct kvm_vcpu * vcpu , gva_t gva ,
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
2010-11-22 18:53:26 +03:00
return vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , gva , access , exception ) ;
2010-02-10 15:21:32 +03:00
}
2010-11-22 18:53:26 +03:00
gpa_t kvm_mmu_gva_to_gpa_fetch ( struct kvm_vcpu * vcpu , gva_t gva ,
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
access | = PFERR_FETCH_MASK ;
2010-11-22 18:53:26 +03:00
return vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , gva , access , exception ) ;
2010-02-10 15:21:32 +03:00
}
2010-11-22 18:53:26 +03:00
gpa_t kvm_mmu_gva_to_gpa_write ( struct kvm_vcpu * vcpu , gva_t gva ,
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
access | = PFERR_WRITE_MASK ;
2010-11-22 18:53:26 +03:00
return vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , gva , access , exception ) ;
2010-02-10 15:21:32 +03:00
}
/* Use this to access any of the guest's mapped memory without checking CPL. */
2010-11-22 18:53:26 +03:00
gpa_t kvm_mmu_gva_to_gpa_system ( struct kvm_vcpu * vcpu , gva_t gva ,
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
2010-11-22 18:53:26 +03:00
return vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , gva , 0 , exception ) ;
2010-02-10 15:21:32 +03:00
}
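/*
 * Illustrative sketch, not part of the driver: the four gva_to_gpa wrappers
 * above differ only in the access mask they pass to the page walker. The
 * DEMO_ constants mirror the x86 page-fault error code bits (write = bit 1,
 * user = bit 2, instruction fetch = bit 4); the enum and helper are
 * hypothetical and exist only to show how the mask is composed.
 */

```c
#include <stdint.h>

#define DEMO_PFERR_WRITE (1u << 1)
#define DEMO_PFERR_USER  (1u << 2)
#define DEMO_PFERR_FETCH (1u << 4)

enum demo_kind { DEMO_READ, DEMO_WRITE, DEMO_FETCH, DEMO_SYSTEM };

static uint32_t demo_access_mask(int cpl, enum demo_kind kind)
{
	uint32_t access;

	if (kind == DEMO_SYSTEM)
		return 0;	/* system access: CPL is not consulted */

	access = (cpl == 3) ? DEMO_PFERR_USER : 0;
	if (kind == DEMO_WRITE)
		access |= DEMO_PFERR_WRITE;
	else if (kind == DEMO_FETCH)
		access |= DEMO_PFERR_FETCH;
	return access;
}
```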
static int kvm_read_guest_virt_helper ( gva_t addr , void * val , unsigned int bytes ,
struct kvm_vcpu * vcpu , u32 access ,
2010-11-22 18:53:22 +03:00
struct x86_exception * exception )
2007-10-30 20:44:21 +03:00
{
void * data = val ;
2007-12-21 03:18:22 +03:00
int r = X86EMUL_CONTINUE ;
2007-10-30 20:44:21 +03:00
while ( bytes ) {
2010-09-10 19:30:49 +04:00
gpa_t gpa = vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , addr , access ,
2010-11-22 18:53:26 +03:00
exception ) ;
2007-10-30 20:44:21 +03:00
unsigned offset = addr & ( PAGE_SIZE - 1 ) ;
2008-12-29 02:42:19 +03:00
unsigned toread = min ( bytes , ( unsigned ) PAGE_SIZE - offset ) ;
2007-10-30 20:44:21 +03:00
int ret ;
2010-11-22 18:53:22 +03:00
if ( gpa = = UNMAPPED_GVA )
2010-11-22 18:53:26 +03:00
return X86EMUL_PROPAGATE_FAULT ;
2008-12-29 02:42:19 +03:00
ret = kvm_read_guest ( vcpu - > kvm , gpa , data , toread ) ;
2007-12-21 03:18:22 +03:00
if ( ret < 0 ) {
2010-04-28 20:15:35 +04:00
r = X86EMUL_IO_NEEDED ;
2007-12-21 03:18:22 +03:00
goto out ;
}
2007-10-30 20:44:21 +03:00
2008-12-29 02:42:19 +03:00
bytes - = toread ;
data + = toread ;
addr + = toread ;
2007-10-30 20:44:21 +03:00
}
2007-12-21 03:18:22 +03:00
out :
return r ;
2007-10-30 20:44:21 +03:00
}
2008-12-29 02:42:19 +03:00
2010-02-10 15:21:32 +03:00
/* used for instruction fetching */
2011-04-20 14:37:53 +04:00
static int kvm_fetch_guest_virt ( struct x86_emulate_ctxt * ctxt ,
gva_t addr , void * val , unsigned int bytes ,
2010-11-22 18:53:22 +03:00
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-02-10 15:21:32 +03:00
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
2011-04-20 14:37:53 +04:00
2010-02-10 15:21:32 +03:00
return kvm_read_guest_virt_helper ( addr , val , bytes , vcpu ,
2010-11-22 18:53:22 +03:00
access | PFERR_FETCH_MASK ,
exception ) ;
2010-02-10 15:21:32 +03:00
}
2011-05-26 00:04:56 +04:00
int kvm_read_guest_virt ( struct x86_emulate_ctxt * ctxt ,
2011-04-20 14:37:53 +04:00
gva_t addr , void * val , unsigned int bytes ,
2010-11-22 18:53:22 +03:00
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-02-10 15:21:32 +03:00
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
2011-04-20 14:37:53 +04:00
2010-02-10 15:21:32 +03:00
return kvm_read_guest_virt_helper ( addr , val , bytes , vcpu , access ,
2010-11-22 18:53:22 +03:00
exception ) ;
2010-02-10 15:21:32 +03:00
}
2011-05-26 00:04:56 +04:00
EXPORT_SYMBOL_GPL ( kvm_read_guest_virt ) ;
2010-02-10 15:21:32 +03:00
2011-04-20 14:37:53 +04:00
static int kvm_read_guest_virt_system ( struct x86_emulate_ctxt * ctxt ,
gva_t addr , void * val , unsigned int bytes ,
2010-11-22 18:53:22 +03:00
struct x86_exception * exception )
2010-02-10 15:21:32 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-11-22 18:53:22 +03:00
return kvm_read_guest_virt_helper ( addr , val , bytes , vcpu , 0 , exception ) ;
2010-02-10 15:21:32 +03:00
}
2011-05-26 00:08:00 +04:00
int kvm_write_guest_virt_system ( struct x86_emulate_ctxt * ctxt ,
2011-04-20 14:37:53 +04:00
gva_t addr , void * val ,
2010-03-18 16:20:16 +03:00
unsigned int bytes ,
2010-11-22 18:53:22 +03:00
struct x86_exception * exception )
2008-12-29 02:42:19 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2008-12-29 02:42:19 +03:00
void * data = val ;
int r = X86EMUL_CONTINUE ;
while ( bytes ) {
2010-09-10 19:30:49 +04:00
gpa_t gpa = vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , addr ,
PFERR_WRITE_MASK ,
2010-11-22 18:53:26 +03:00
exception ) ;
2008-12-29 02:42:19 +03:00
unsigned offset = addr & ( PAGE_SIZE - 1 ) ;
unsigned towrite = min ( bytes , ( unsigned ) PAGE_SIZE - offset ) ;
int ret ;
2010-11-22 18:53:22 +03:00
if ( gpa = = UNMAPPED_GVA )
2010-11-22 18:53:26 +03:00
return X86EMUL_PROPAGATE_FAULT ;
2008-12-29 02:42:19 +03:00
ret = kvm_write_guest ( vcpu - > kvm , gpa , data , towrite ) ;
if ( ret < 0 ) {
2010-04-28 20:15:35 +04:00
r = X86EMUL_IO_NEEDED ;
2008-12-29 02:42:19 +03:00
goto out ;
}
bytes - = towrite ;
data + = towrite ;
addr + = towrite ;
}
out :
return r ;
}
2011-05-26 00:08:00 +04:00
EXPORT_SYMBOL_GPL ( kvm_write_guest_virt_system ) ;
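/*
 * Illustrative sketch, not part of the driver: kvm_read_guest_virt_helper()
 * and kvm_write_guest_virt_system() above translate one page at a time, so
 * each pass copies at most the bytes left in the current page. The helpers
 * below are hypothetical and assume 4 KiB pages; they show only the length
 * arithmetic, not the translation or the copy.
 */

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Bytes that can be accessed starting at addr without leaving its page. */
static unsigned demo_chunk_in_page(uint64_t addr, unsigned bytes)
{
	unsigned offset = (unsigned)(addr & (DEMO_PAGE_SIZE - 1));
	unsigned room = DEMO_PAGE_SIZE - offset;

	return bytes < room ? bytes : room;
}

/* Number of per-page pieces a [addr, addr + bytes) access is split into. */
static unsigned demo_count_chunks(uint64_t addr, unsigned bytes)
{
	unsigned chunks = 0;

	while (bytes) {
		unsigned n = demo_chunk_in_page(addr, bytes);

		addr += n;
		bytes -= n;
		chunks++;
	}
	return chunks;
}
```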
2008-12-29 02:42:19 +03:00
2011-07-11 23:22:46 +04:00
static int vcpu_mmio_gva_to_gpa ( struct kvm_vcpu * vcpu , unsigned long gva ,
gpa_t * gpa , struct x86_exception * exception ,
bool write )
{
u32 access = ( kvm_x86_ops - > get_cpl ( vcpu ) = = 3 ) ? PFERR_USER_MASK : 0 ;
2011-07-11 23:23:20 +04:00
if ( vcpu_match_mmio_gva ( vcpu , gva ) & &
check_write_user_access ( vcpu , write , access ,
vcpu - > arch . access ) ) {
* gpa = vcpu - > arch . mmio_gfn < < PAGE_SHIFT |
( gva & ( PAGE_SIZE - 1 ) ) ;
2011-07-11 23:34:24 +04:00
trace_vcpu_match_mmio ( gva , * gpa , write , false ) ;
2011-07-11 23:23:20 +04:00
return 1 ;
}
2011-07-11 23:22:46 +04:00
if ( write )
access | = PFERR_WRITE_MASK ;
* gpa = vcpu - > arch . walk_mmu - > gva_to_gpa ( vcpu , gva , access , exception ) ;
if ( * gpa = = UNMAPPED_GVA )
return - 1 ;
/* For APIC access vmexit */
if ( ( * gpa & PAGE_MASK ) = = APIC_DEFAULT_PHYS_BASE )
return 1 ;
2011-07-11 23:34:24 +04:00
if ( vcpu_match_mmio_gpa ( vcpu , * gpa ) ) {
trace_vcpu_match_mmio ( gva , * gpa , write , true ) ;
2011-07-11 23:23:20 +04:00
return 1 ;
2011-07-11 23:34:24 +04:00
}
2011-07-11 23:23:20 +04:00
2011-07-11 23:22:46 +04:00
return 0 ;
}
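/*
 * Illustrative sketch, not part of the driver: on the cached-MMIO fast path,
 * vcpu_mmio_gva_to_gpa() above rebuilds the guest-physical address from the
 * cached frame number and the page offset of the virtual address. The helper
 * name is hypothetical and 4 KiB pages are assumed.
 */

```c
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12	/* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1ull << DEMO_PAGE_SHIFT)

/* Combine a guest frame number with the in-page offset of a virtual address. */
static uint64_t demo_compose_gpa(uint64_t gfn, uint64_t gva)
{
	return (gfn << DEMO_PAGE_SHIFT) | (gva & (DEMO_PAGE_SIZE - 1));
}
```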
2008-03-30 03:17:59 +04:00
int emulator_write_phys ( struct kvm_vcpu * vcpu , gpa_t gpa ,
2010-11-22 18:53:22 +03:00
const void * val , int bytes )
2007-10-30 20:44:21 +03:00
{
int ret ;
ret = kvm_write_guest ( vcpu - > kvm , gpa , val , bytes ) ;
2008-03-02 15:06:05 +03:00
if ( ret < 0 )
2007-10-30 20:44:21 +03:00
return 0 ;
2011-09-22 12:56:39 +04:00
kvm_mmu_pte_write ( vcpu , gpa , val , bytes ) ;
2007-10-30 20:44:21 +03:00
return 1 ;
}
2011-07-13 10:31:50 +04:00
struct read_write_emulator_ops {
int ( * read_write_prepare ) ( struct kvm_vcpu * vcpu , void * val ,
int bytes ) ;
int ( * read_write_emulate ) ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes ) ;
int ( * read_write_mmio ) ( struct kvm_vcpu * vcpu , gpa_t gpa ,
int bytes , void * val ) ;
int ( * read_write_exit_mmio ) ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes ) ;
bool write ;
} ;
static int read_prepare ( struct kvm_vcpu * vcpu , void * val , int bytes )
{
if ( vcpu - > mmio_read_completed ) {
trace_kvm_mmio ( KVM_TRACE_MMIO_READ , bytes ,
2012-04-18 20:22:47 +04:00
vcpu - > mmio_fragments [ 0 ] . gpa , * ( u64 * ) val ) ;
2011-07-13 10:31:50 +04:00
vcpu - > mmio_read_completed = 0 ;
return 1 ;
}
return 0 ;
}
static int read_emulate ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes )
{
return ! kvm_read_guest ( vcpu - > kvm , gpa , val , bytes ) ;
}
static int write_emulate ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes )
{
return emulator_write_phys ( vcpu , gpa , val , bytes ) ;
}
static int write_mmio ( struct kvm_vcpu * vcpu , gpa_t gpa , int bytes , void * val )
{
trace_kvm_mmio ( KVM_TRACE_MMIO_WRITE , bytes , gpa , * ( u64 * ) val ) ;
return vcpu_mmio_write ( vcpu , gpa , bytes , val ) ;
}
static int read_exit_mmio ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes )
{
trace_kvm_mmio ( KVM_TRACE_MMIO_READ_UNSATISFIED , bytes , gpa , 0 ) ;
return X86EMUL_IO_NEEDED ;
}
static int write_exit_mmio ( struct kvm_vcpu * vcpu , gpa_t gpa ,
void * val , int bytes )
{
2012-04-18 20:22:47 +04:00
struct kvm_mmio_fragment * frag = & vcpu - > mmio_fragments [ 0 ] ;
memcpy ( vcpu - > run - > mmio . data , frag - > data , frag - > len ) ;
2011-07-13 10:31:50 +04:00
return X86EMUL_CONTINUE ;
}
static struct read_write_emulator_ops read_emultor = {
. read_write_prepare = read_prepare ,
. read_write_emulate = read_emulate ,
. read_write_mmio = vcpu_mmio_read ,
. read_write_exit_mmio = read_exit_mmio ,
} ;
static struct read_write_emulator_ops write_emultor = {
. read_write_emulate = write_emulate ,
. read_write_mmio = write_mmio ,
. read_write_exit_mmio = write_exit_mmio ,
. write = true ,
} ;
2011-07-13 10:32:31 +04:00
static int emulator_read_write_onepage ( unsigned long addr , void * val ,
unsigned int bytes ,
struct x86_exception * exception ,
struct kvm_vcpu * vcpu ,
struct read_write_emulator_ops * ops )
2007-10-30 20:44:21 +03:00
{
2011-07-11 23:22:46 +04:00
gpa_t gpa ;
int handled , ret ;
2011-07-13 10:32:31 +04:00
bool write = ops - > write ;
2012-04-18 20:22:47 +04:00
struct kvm_mmio_fragment * frag ;
2007-12-21 03:18:22 +03:00
2011-07-13 10:32:31 +04:00
ret = vcpu_mmio_gva_to_gpa ( vcpu , addr , & gpa , exception , write ) ;
2007-10-30 20:44:21 +03:00
2011-07-11 23:22:46 +04:00
if ( ret < 0 )
2007-10-30 20:44:21 +03:00
return X86EMUL_PROPAGATE_FAULT ;
/* For APIC access vmexit */
2011-07-11 23:22:46 +04:00
if ( ret )
2007-10-30 20:44:21 +03:00
goto mmio ;
2011-07-13 10:32:31 +04:00
if ( ops - > read_write_emulate ( vcpu , gpa , val , bytes ) )
2007-10-30 20:44:21 +03:00
return X86EMUL_CONTINUE ;
mmio :
/*
* Is this MMIO handled locally ?
*/
2011-07-13 10:32:31 +04:00
handled = ops - > read_write_mmio ( vcpu , gpa , bytes , val ) ;
2010-01-19 13:51:22 +03:00
if ( handled = = bytes )
2007-10-30 20:44:21 +03:00
return X86EMUL_CONTINUE ;
2010-01-19 13:51:22 +03:00
gpa + = handled ;
bytes - = handled ;
val + = handled ;
2012-04-18 20:22:47 +04:00
while ( bytes ) {
unsigned now = min ( bytes , 8U ) ;
2007-10-30 20:44:21 +03:00
2012-04-18 20:22:47 +04:00
frag = & vcpu - > mmio_fragments [ vcpu - > mmio_nr_fragments + + ] ;
frag - > gpa = gpa ;
frag - > data = val ;
frag - > len = now ;
gpa + = now ;
val + = now ;
bytes - = now ;
}
return X86EMUL_CONTINUE ;
2007-10-30 20:44:21 +03:00
}
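/*
 * Illustrative sketch, not part of the driver: emulator_read_write() below
 * detects an access that straddles a page boundary by XOR-ing its first and
 * last byte addresses, and computes the bytes left in the first page as
 * -addr & ~PAGE_MASK. The DEMO_ macros and helpers are hypothetical and
 * assume 4 KiB pages.
 */

```c
#include <assert.h>

#define DEMO_PAGE_SIZE 4096ul	/* assumption: 4 KiB pages */
#define DEMO_PAGE_MASK (~(DEMO_PAGE_SIZE - 1))

/* Nonzero when the first and last byte of the access lie in different pages. */
static int demo_crosses_page(unsigned long addr, unsigned int bytes)
{
	assert(bytes > 0);	/* bytes - 1 would wrap for an empty access */
	return (((addr + bytes - 1) ^ addr) & DEMO_PAGE_MASK) != 0;
}

/*
 * Bytes from addr up to the next page boundary. Unsigned negation is
 * well defined in C; the result is 0 when addr is already page aligned.
 */
static unsigned int demo_bytes_to_page_end(unsigned long addr)
{
	return (unsigned int)(-addr & ~DEMO_PAGE_MASK);
}
```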
2011-07-13 10:32:31 +04:00
int emulator_read_write ( struct x86_emulate_ctxt * ctxt , unsigned long addr ,
void * val , unsigned int bytes ,
struct x86_exception * exception ,
struct read_write_emulator_ops * ops )
2007-10-30 20:44:21 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2012-04-18 20:22:47 +04:00
gpa_t gpa ;
int rc ;
if ( ops - > read_write_prepare & &
ops - > read_write_prepare ( vcpu , val , bytes ) )
return X86EMUL_CONTINUE ;
vcpu - > mmio_nr_fragments = 0 ;
2011-04-20 14:37:53 +04:00
2007-10-30 20:44:21 +03:00
/* Crossing a page boundary? */
if ( ( ( addr + bytes - 1 ) ^ addr ) & PAGE_MASK ) {
2012-04-18 20:22:47 +04:00
int now ;
2007-10-30 20:44:21 +03:00
now = - addr & ~ PAGE_MASK ;
2011-07-13 10:32:31 +04:00
rc = emulator_read_write_onepage ( addr , val , now , exception ,
vcpu , ops ) ;
2007-10-30 20:44:21 +03:00
if ( rc ! = X86EMUL_CONTINUE )
return rc ;
addr + = now ;
val + = now ;
bytes - = now ;
}
2011-07-13 10:32:31 +04:00
2012-04-18 20:22:47 +04:00
rc = emulator_read_write_onepage ( addr , val , bytes , exception ,
vcpu , ops ) ;
if ( rc ! = X86EMUL_CONTINUE )
return rc ;
if ( ! vcpu - > mmio_nr_fragments )
return rc ;
gpa = vcpu - > mmio_fragments [ 0 ] . gpa ;
vcpu - > mmio_needed = 1 ;
vcpu - > mmio_cur_fragment = 0 ;
vcpu - > run - > mmio . len = vcpu - > mmio_fragments [ 0 ] . len ;
vcpu - > run - > mmio . is_write = vcpu - > mmio_is_write = ops - > write ;
vcpu - > run - > exit_reason = KVM_EXIT_MMIO ;
vcpu - > run - > mmio . phys_addr = gpa ;
return ops - > read_write_exit_mmio ( vcpu , gpa , val , bytes ) ;
2011-07-13 10:32:31 +04:00
}
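The page-boundary handling in emulator_read_write can be illustrated in isolation. What follows is a minimal userspace sketch, not kernel code. It assumes 4 KiB pages, and the names SKETCH_PAGE_SIZE, SKETCH_PAGE_MASK, crosses_page and bytes_to_page_end are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, mirroring PAGE_SIZE/PAGE_MASK on x86. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/* Returns 1 when [addr, addr + bytes) touches more than one page. */
static int crosses_page(unsigned long addr, unsigned int bytes)
{
	assert(bytes > 0);
	/* The first and last byte differ in a page-number bit. */
	return (((addr + bytes - 1) ^ addr) & SKETCH_PAGE_MASK) != 0;
}

/* Bytes from addr to the end of its page (0 if addr is page-aligned). */
static unsigned int bytes_to_page_end(unsigned long addr)
{
	return (unsigned int)(-addr & ~SKETCH_PAGE_MASK);
}
```

Under these assumptions, an 8-byte access at 0x1ffc crosses into the next page: the first fragment covers 4 bytes, and the remaining 4 bytes start at 0x2000 and are handled by the second one-page call.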
static int emulator_read_emulated ( struct x86_emulate_ctxt * ctxt ,
unsigned long addr ,
void * val ,
unsigned int bytes ,
struct x86_exception * exception )
{
return emulator_read_write ( ctxt , addr , val , bytes ,
exception , & read_emultor ) ;
}
int emulator_write_emulated ( struct x86_emulate_ctxt * ctxt ,
unsigned long addr ,
const void * val ,
unsigned int bytes ,
struct x86_exception * exception )
{
return emulator_read_write ( ctxt , addr , ( void * ) val , bytes ,
exception , & write_emultor ) ;
2007-10-30 20:44:21 +03:00
}
2010-03-15 14:59:54 +03:00
# define CMPXCHG_TYPE(t, ptr, old, new) \
( cmpxchg ( ( t * ) ( ptr ) , * ( t * ) ( old ) , * ( t * ) ( new ) ) = = * ( t * ) ( old ) )
# ifdef CONFIG_X86_64
# define CMPXCHG64(ptr, old, new) CMPXCHG_TYPE(u64, ptr, old, new)
# else
# define CMPXCHG64(ptr, old, new) \
2010-03-20 12:14:13 +03:00
( cmpxchg64 ( ( u64 * ) ( ptr ) , * ( u64 * ) ( old ) , * ( u64 * ) ( new ) ) = = * ( u64 * ) ( old ) )
2010-03-15 14:59:54 +03:00
# endif
2011-04-20 14:37:53 +04:00
static int emulator_cmpxchg_emulated ( struct x86_emulate_ctxt * ctxt ,
unsigned long addr ,
2007-10-30 20:44:21 +03:00
const void * old ,
const void * new ,
unsigned int bytes ,
2011-04-20 14:37:53 +04:00
struct x86_exception * exception )
2007-10-30 20:44:21 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-03-15 14:59:54 +03:00
gpa_t gpa ;
struct page * page ;
char * kaddr ;
bool exchanged ;
2007-12-12 18:46:12 +03:00
2010-03-15 14:59:54 +03:00
/* a guest's cmpxchg8b has to be emulated atomically */
if ( bytes > 8 | | ( bytes & ( bytes - 1 ) ) )
goto emul_write ;
2007-12-21 03:18:22 +03:00
2010-03-15 14:59:54 +03:00
gpa = kvm_mmu_gva_to_gpa_write ( vcpu , addr , NULL ) ;
2007-12-12 18:46:12 +03:00
2010-03-15 14:59:54 +03:00
if ( gpa = = UNMAPPED_GVA | |
( gpa & PAGE_MASK ) = = APIC_DEFAULT_PHYS_BASE )
goto emul_write ;
2007-12-12 18:46:12 +03:00
2010-03-15 14:59:54 +03:00
if ( ( ( gpa + bytes - 1 ) & PAGE_MASK ) ! = ( gpa & PAGE_MASK ) )
goto emul_write ;
2008-02-10 19:04:15 +03:00
2010-03-15 14:59:54 +03:00
page = gfn_to_page ( vcpu - > kvm , gpa > > PAGE_SHIFT ) ;
2010-07-15 04:51:58 +04:00
if ( is_error_page ( page ) ) {
kvm_release_page_clean ( page ) ;
goto emul_write ;
}
2008-02-10 19:04:15 +03:00
2011-11-25 19:14:17 +04:00
kaddr = kmap_atomic ( page ) ;
2010-03-15 14:59:54 +03:00
kaddr + = offset_in_page ( gpa ) ;
switch ( bytes ) {
case 1 :
exchanged = CMPXCHG_TYPE ( u8 , kaddr , old , new ) ;
break ;
case 2 :
exchanged = CMPXCHG_TYPE ( u16 , kaddr , old , new ) ;
break ;
case 4 :
exchanged = CMPXCHG_TYPE ( u32 , kaddr , old , new ) ;
break ;
case 8 :
exchanged = CMPXCHG64 ( kaddr , old , new ) ;
break ;
default :
BUG ( ) ;
2007-12-12 18:46:12 +03:00
}
2011-11-25 19:14:17 +04:00
kunmap_atomic ( kaddr ) ;
2010-03-15 14:59:54 +03:00
kvm_release_page_dirty ( page ) ;
if ( ! exchanged )
return X86EMUL_CMPXCHG_FAILED ;
2011-09-22 12:56:39 +04:00
kvm_mmu_pte_write ( vcpu , gpa , new , bytes ) ;
2010-04-13 11:21:56 +04:00
return X86EMUL_CONTINUE ;
2010-03-15 14:59:55 +03:00
2008-03-30 03:17:59 +04:00
emul_write :
2010-03-15 14:59:54 +03:00
printk_once ( KERN_WARNING " kvm: emulating exchange as write \n " ) ;
2007-12-12 18:46:12 +03:00
2011-04-20 14:37:53 +04:00
return emulator_write_emulated ( ctxt , addr , new , bytes , exception ) ;
2007-10-30 20:44:21 +03:00
}
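The size filter and the "did the exchange happen?" result used by emulator_cmpxchg_emulated can be sketched in userspace. This is an illustration only: it uses C11 atomics in place of the kernel's cmpxchg(), and atomic_size_ok and cmpxchg_u32 are hypothetical names that do not exist in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Same size filter as the emulated path: reject anything above 8 bytes
 * or not a power of two. Note that 0 also passes this particular test.
 */
static bool atomic_size_ok(unsigned int bytes)
{
	return !(bytes > 8 || (bytes & (bytes - 1)));
}

/*
 * 32-bit compare-and-exchange returning the same success flag that
 * CMPXCHG_TYPE computes. old_val and new_val point at the operands,
 * as the untyped const void * arguments do in the emulator callback.
 */
static bool cmpxchg_u32(_Atomic uint32_t *ptr, const void *old_val,
			const void *new_val)
{
	uint32_t expected = *(const uint32_t *)old_val;

	return atomic_compare_exchange_strong(ptr, &expected,
					      *(const uint32_t *)new_val);
}
```

A failed exchange leaves the target untouched, which corresponds to the X86EMUL_CMPXCHG_FAILED return in the function above.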
2010-03-18 16:20:23 +03:00
static int kernel_pio ( struct kvm_vcpu * vcpu , void * pd )
{
/* TODO: string I/O for in-kernel devices */
int r ;
if ( vcpu - > arch . pio . in )
r = kvm_io_bus_read ( vcpu - > kvm , KVM_PIO_BUS , vcpu - > arch . pio . port ,
vcpu - > arch . pio . size , pd ) ;
else
r = kvm_io_bus_write ( vcpu - > kvm , KVM_PIO_BUS ,
vcpu - > arch . pio . port , vcpu - > arch . pio . size ,
pd ) ;
return r ;
}
2011-09-22 12:55:10 +04:00
static int emulator_pio_in_out ( struct kvm_vcpu * vcpu , int size ,
unsigned short port , void * val ,
unsigned int count , bool in )
2010-03-18 16:20:23 +03:00
{
2011-09-22 12:55:10 +04:00
trace_kvm_pio ( ! in , port , size , count ) ;
2010-03-18 16:20:23 +03:00
vcpu - > arch . pio . port = port ;
2011-09-22 12:55:10 +04:00
vcpu - > arch . pio . in = in ;
2010-03-18 16:20:24 +03:00
vcpu - > arch . pio . count = count ;
2010-03-18 16:20:23 +03:00
vcpu - > arch . pio . size = size ;
if ( ! kernel_pio ( vcpu , vcpu - > arch . pio_data ) ) {
2010-03-18 16:20:24 +03:00
vcpu - > arch . pio . count = 0 ;
2010-03-18 16:20:23 +03:00
return 1 ;
}
vcpu - > run - > exit_reason = KVM_EXIT_IO ;
2011-09-22 12:55:10 +04:00
vcpu - > run - > io . direction = in ? KVM_EXIT_IO_IN : KVM_EXIT_IO_OUT ;
2010-03-18 16:20:23 +03:00
vcpu - > run - > io . size = size ;
vcpu - > run - > io . data_offset = KVM_PIO_PAGE_OFFSET * PAGE_SIZE ;
vcpu - > run - > io . count = count ;
vcpu - > run - > io . port = port ;
return 0 ;
}
2011-09-22 12:55:10 +04:00
static int emulator_pio_in_emulated ( struct x86_emulate_ctxt * ctxt ,
int size , unsigned short port , void * val ,
unsigned int count )
2010-03-18 16:20:23 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2011-09-22 12:55:10 +04:00
int ret ;
2011-04-20 14:37:53 +04:00
2011-09-22 12:55:10 +04:00
if ( vcpu - > arch . pio . count )
goto data_avail ;
2010-03-18 16:20:23 +03:00
2011-09-22 12:55:10 +04:00
ret = emulator_pio_in_out ( vcpu , size , port , val , count , true ) ;
if ( ret ) {
data_avail :
memcpy ( val , vcpu - > arch . pio_data , size * count ) ;
2010-03-18 16:20:24 +03:00
vcpu - > arch . pio . count = 0 ;
2010-03-18 16:20:23 +03:00
return 1 ;
}
return 0 ;
}
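The two-pass flow of emulator_pio_in_emulated (exit to userspace with a pending count, then pick up the data on re-entry) can be modelled outside the kernel. The sketch below is a simplification under stated assumptions: struct sketch_pio and sketch_pio_in are hypothetical, and the kernel_dev argument stands in for an in-kernel device that answers immediately.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the vcpu's PIO bookkeeping. */
struct sketch_pio {
	unsigned int count;	/* nonzero while a request is outstanding */
	unsigned char data[16];	/* buffer filled by kernel or userspace */
};

/*
 * Returns 1 when val has been filled, 0 when the caller must exit to
 * userspace and retry. kernel_dev, if non-NULL, plays the role of an
 * in-kernel device that can supply the bytes immediately.
 */
static int sketch_pio_in(struct sketch_pio *pio, void *val, int size,
			 unsigned int count, const unsigned char *kernel_dev)
{
	size_t len = (size_t)size * count;

	assert(len <= sizeof(pio->data));

	if (pio->count)
		goto data_avail;	/* second pass: data was filled in */

	pio->count = count;
	if (!kernel_dev)
		return 0;		/* keep count set, exit to userspace */
	memcpy(pio->data, kernel_dev, len);

data_avail:
	memcpy(val, pio->data, len);
	pio->count = 0;
	return 1;
}
```

The pending count doubles as the "data is available" flag, which is why it is cleared on every path that returns 1.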
2011-09-22 12:55:10 +04:00
static int emulator_pio_out_emulated ( struct x86_emulate_ctxt * ctxt ,
int size , unsigned short port ,
const void * val , unsigned int count )
{
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
memcpy ( vcpu - > arch . pio_data , val , size * count ) ;
return emulator_pio_in_out ( vcpu , size , port , ( void * ) val , count , false ) ;
}
2007-10-30 20:44:21 +03:00
static unsigned long get_segment_base ( struct kvm_vcpu * vcpu , int seg )
{
return kvm_x86_ops - > get_segment_base ( vcpu , seg ) ;
}
2011-04-20 16:38:44 +04:00
static void emulator_invlpg ( struct x86_emulate_ctxt * ctxt , ulong address )
2007-10-30 20:44:21 +03:00
{
2011-04-20 16:38:44 +04:00
kvm_mmu_invlpg ( emul_to_vcpu ( ctxt ) , address ) ;
2007-10-30 20:44:21 +03:00
}
2010-06-30 08:25:15 +04:00
int kvm_emulate_wbinvd ( struct kvm_vcpu * vcpu )
{
if ( ! need_emulate_wbinvd ( vcpu ) )
return X86EMUL_CONTINUE ;
if ( kvm_x86_ops - > has_wbinvd_exit ( ) ) {
2010-11-01 16:01:29 +03:00
int cpu = get_cpu ( ) ;
cpumask_set_cpu ( cpu , vcpu - > arch . wbinvd_dirty_mask ) ;
2010-06-30 08:25:15 +04:00
smp_call_function_many ( vcpu - > arch . wbinvd_dirty_mask ,
wbinvd_ipi , NULL , 1 ) ;
2010-11-01 16:01:29 +03:00
put_cpu ( ) ;
2010-06-30 08:25:15 +04:00
cpumask_clear ( vcpu - > arch . wbinvd_dirty_mask ) ;
2010-11-01 16:01:29 +03:00
} else
wbinvd ( ) ;
2010-06-30 08:25:15 +04:00
return X86EMUL_CONTINUE ;
}
EXPORT_SYMBOL_GPL ( kvm_emulate_wbinvd ) ;
2011-04-20 16:53:23 +04:00
static void emulator_wbinvd ( struct x86_emulate_ctxt * ctxt )
{
kvm_emulate_wbinvd ( emul_to_vcpu ( ctxt ) ) ;
}
2011-04-20 14:37:53 +04:00
int emulator_get_dr ( struct x86_emulate_ctxt * ctxt , int dr , unsigned long * dest )
2007-10-30 20:44:21 +03:00
{
2011-04-20 14:37:53 +04:00
return _kvm_get_dr ( emul_to_vcpu ( ctxt ) , dr , dest ) ;
2007-10-30 20:44:21 +03:00
}
2011-04-20 14:37:53 +04:00
int emulator_set_dr ( struct x86_emulate_ctxt * ctxt , int dr , unsigned long value )
2007-10-30 20:44:21 +03:00
{
2010-04-28 20:15:32 +04:00
2011-04-20 14:37:53 +04:00
return __kvm_set_dr ( emul_to_vcpu ( ctxt ) , dr , value ) ;
2007-10-30 20:44:21 +03:00
}
2010-03-18 16:20:03 +03:00
static u64 mk_cr_64 ( u64 curr_cr , u32 new_val )
2008-06-27 21:58:02 +04:00
{
2010-03-18 16:20:03 +03:00
return ( curr_cr & ~ ( ( 1ULL < < 32 ) - 1 ) ) | new_val ;
2008-06-27 21:58:02 +04:00
}
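The effect of mk_cr_64 is easy to check by hand: it keeps the upper 32 bits of the current register value and replaces the lower 32 bits with the new value. The following userspace sketch restates it with standard fixed-width types; the name mk_cr_64_sketch is hypothetical and the body is otherwise the same expression.

```c
#include <assert.h>
#include <stdint.h>

/* Keep bits 63..32 of curr_cr, take bits 31..0 from new_val. */
static uint64_t mk_cr_64_sketch(uint64_t curr_cr, uint32_t new_val)
{
	return (curr_cr & ~((1ULL << 32) - 1)) | new_val;
}
```

For instance, combining a current value of 0x1234567800000011 with a new low half of 0x80000001 yields 0x1234567880000001.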
2011-04-20 14:37:53 +04:00
static unsigned long emulator_get_cr ( struct x86_emulate_ctxt * ctxt , int cr )
2007-10-30 20:44:21 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-03-18 16:20:03 +03:00
unsigned long value ;
switch ( cr ) {
case 0 :
value = kvm_read_cr0 ( vcpu ) ;
break ;
case 2 :
value = vcpu - > arch . cr2 ;
break ;
case 3 :
2010-12-05 18:30:00 +03:00
value = kvm_read_cr3 ( vcpu ) ;
2010-03-18 16:20:03 +03:00
break ;
case 4 :
value = kvm_read_cr4 ( vcpu ) ;
break ;
case 8 :
value = kvm_get_cr8 ( vcpu ) ;
break ;
default :
KVM: Cleanup the kvm_print functions and introduce pr_XX wrappers
Introduces a couple of print functions, which are essentially wrappers
around standard printk functions, with a KVM: prefix.
Functions introduced or modified are:
- kvm_err(fmt, ...)
- kvm_info(fmt, ...)
- kvm_debug(fmt, ...)
- kvm_pr_unimpl(fmt, ...)
- pr_unimpl(vcpu, fmt, ...) -> vcpu_unimpl(vcpu, fmt, ...)
Signed-off-by: Christoffer Dall <c.dall@virtualopensystems.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-06-03 22:17:48 +04:00
kvm_err ( " %s: unexpected cr %u \n " , __func__ , cr ) ;
2010-03-18 16:20:03 +03:00
return 0 ;
}
return value ;
}
2011-04-20 14:37:53 +04:00
static int emulator_set_cr ( struct x86_emulate_ctxt * ctxt , int cr , ulong val )
2010-03-18 16:20:03 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-04-28 20:15:31 +04:00
int res = 0 ;
2010-03-18 16:20:03 +03:00
switch ( cr ) {
case 0 :
2010-06-10 18:02:14 +04:00
res = kvm_set_cr0 ( vcpu , mk_cr_64 ( kvm_read_cr0 ( vcpu ) , val ) ) ;
2010-03-18 16:20:03 +03:00
break ;
case 2 :
vcpu - > arch . cr2 = val ;
break ;
case 3 :
2010-06-10 18:02:16 +04:00
res = kvm_set_cr3 ( vcpu , val ) ;
2010-03-18 16:20:03 +03:00
break ;
case 4 :
2010-06-10 18:02:15 +04:00
res = kvm_set_cr4 ( vcpu , mk_cr_64 ( kvm_read_cr4 ( vcpu ) , val ) ) ;
2010-03-18 16:20:03 +03:00
break ;
case 8 :
2010-12-21 13:12:00 +03:00
res = kvm_set_cr8 ( vcpu , val ) ;
2010-03-18 16:20:03 +03:00
break ;
default :
2012-06-03 22:17:48 +04:00
kvm_err ( " %s: unexpected cr %u \n " , __func__ , cr ) ;
2010-04-28 20:15:31 +04:00
res = - 1 ;
2010-03-18 16:20:03 +03:00
}
2010-04-28 20:15:31 +04:00
return res ;
2010-03-18 16:20:03 +03:00
}
2012-02-08 17:34:41 +04:00
static void emulator_set_rflags ( struct x86_emulate_ctxt * ctxt , ulong val )
{
kvm_set_rflags ( emul_to_vcpu ( ctxt ) , val ) ;
}
2011-04-20 14:37:53 +04:00
static int emulator_get_cpl ( struct x86_emulate_ctxt * ctxt )
2010-03-18 16:20:05 +03:00
{
2011-04-20 14:37:53 +04:00
return kvm_x86_ops - > get_cpl ( emul_to_vcpu ( ctxt ) ) ;
2010-03-18 16:20:05 +03:00
}
2011-04-20 14:37:53 +04:00
static void emulator_get_gdt ( struct x86_emulate_ctxt * ctxt , struct desc_ptr * dt )
2010-03-18 16:20:16 +03:00
{
2011-04-20 14:37:53 +04:00
kvm_x86_ops - > get_gdt ( emul_to_vcpu ( ctxt ) , dt ) ;
2010-03-18 16:20:16 +03:00
}
2011-04-20 14:37:53 +04:00
static void emulator_get_idt ( struct x86_emulate_ctxt * ctxt , struct desc_ptr * dt )
2010-08-04 06:44:24 +04:00
{
2011-04-20 14:37:53 +04:00
kvm_x86_ops - > get_idt ( emul_to_vcpu ( ctxt ) , dt ) ;
2010-08-04 06:44:24 +04:00
}
2011-04-20 16:12:00 +04:00
static void emulator_set_gdt ( struct x86_emulate_ctxt * ctxt , struct desc_ptr * dt )
{
kvm_x86_ops - > set_gdt ( emul_to_vcpu ( ctxt ) , dt ) ;
}
static void emulator_set_idt ( struct x86_emulate_ctxt * ctxt , struct desc_ptr * dt )
{
kvm_x86_ops - > set_idt ( emul_to_vcpu ( ctxt ) , dt ) ;
}
2011-04-20 14:37:53 +04:00
static unsigned long emulator_get_cached_segment_base (
struct x86_emulate_ctxt * ctxt , int seg )
2010-04-28 20:15:29 +04:00
{
2011-04-20 14:37:53 +04:00
return get_segment_base ( emul_to_vcpu ( ctxt ) , seg ) ;
2010-04-28 20:15:29 +04:00
}
2011-04-27 14:20:30 +04:00
static bool emulator_get_segment ( struct x86_emulate_ctxt * ctxt , u16 * selector ,
struct desc_struct * desc , u32 * base3 ,
int seg )
2010-03-18 16:20:16 +03:00
{
struct kvm_segment var ;
2011-04-20 14:37:53 +04:00
kvm_get_segment ( emul_to_vcpu ( ctxt ) , & var , seg ) ;
2011-04-27 14:20:30 +04:00
* selector = var . selector ;
2010-03-18 16:20:16 +03:00
if ( var . unusable )
return false ;
if ( var . g )
var . limit > > = 12 ;
set_desc_limit ( desc , var . limit ) ;
set_desc_base ( desc , ( unsigned long ) var . base ) ;
2011-03-07 15:55:06 +03:00
# ifdef CONFIG_X86_64
if ( base3 )
* base3 = var . base > > 32 ;
# endif
2010-03-18 16:20:16 +03:00
desc - > type = var . type ;
desc - > s = var . s ;
desc - > dpl = var . dpl ;
desc - > p = var . present ;
desc - > avl = var . avl ;
desc - > l = var . l ;
desc - > d = var . db ;
desc - > g = var . g ;
return true ;
}
2011-04-27 14:20:30 +04:00
static void emulator_set_segment ( struct x86_emulate_ctxt * ctxt , u16 selector ,
struct desc_struct * desc , u32 base3 ,
int seg )
2010-03-18 16:20:16 +03:00
{
2011-04-20 14:37:53 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2010-03-18 16:20:16 +03:00
struct kvm_segment var ;
2011-04-27 14:20:30 +04:00
var . selector = selector ;
2010-03-18 16:20:16 +03:00
var . base = get_desc_base ( desc ) ;
2011-03-07 15:55:06 +03:00
# ifdef CONFIG_X86_64
var . base | = ( ( u64 ) base3 ) < < 32 ;
# endif
2010-03-18 16:20:16 +03:00
var . limit = get_desc_limit ( desc ) ;
if ( desc - > g )
var . limit = ( var . limit < < 12 ) | 0xfff ;
var . type = desc - > type ;
var . present = desc - > p ;
var . dpl = desc - > dpl ;
var . db = desc - > d ;
var . s = desc - > s ;
var . l = desc - > l ;
var . g = desc - > g ;
var . avl = desc - > avl ;
var . unusable = ! var . present ;
var . padding = 0 ;
kvm_set_segment ( vcpu , & var , seg ) ;
return ;
}
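emulator_get_segment and emulator_set_segment convert between a byte-granular limit and the 20-bit descriptor limit whenever the granularity bit is set. A minimal userspace sketch of that conversion follows; limit_to_desc and limit_from_desc are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-granular limit -> descriptor limit field (as in get_segment). */
static uint32_t limit_to_desc(uint32_t byte_limit, int g)
{
	return g ? byte_limit >> 12 : byte_limit;
}

/* Descriptor limit field -> byte-granular limit (as in set_segment). */
static uint32_t limit_from_desc(uint32_t desc_limit, int g)
{
	/* With G set the low 12 bits are implied to be all ones. */
	return g ? (desc_limit << 12) | 0xfff : desc_limit;
}
```

With the granularity bit set, a descriptor limit of 0xfffff expands to 0xffffffff, and shifting that back down returns the original 0xfffff, so the two directions round-trip for page-granular limits.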
2011-04-20 14:37:53 +04:00
static int emulator_get_msr ( struct x86_emulate_ctxt * ctxt ,
u32 msr_index , u64 * pdata )
{
return kvm_get_msr ( emul_to_vcpu ( ctxt ) , msr_index , pdata ) ;
}
static int emulator_set_msr ( struct x86_emulate_ctxt * ctxt ,
u32 msr_index , u64 data )
{
return kvm_set_msr ( emul_to_vcpu ( ctxt ) , msr_index , data ) ;
}
2011-11-10 16:57:30 +04:00
static int emulator_read_pmc ( struct x86_emulate_ctxt * ctxt ,
u32 pmc , u64 * pdata )
{
return kvm_pmu_read_pmc ( emul_to_vcpu ( ctxt ) , pmc , pdata ) ;
}
2011-04-20 16:43:05 +04:00
static void emulator_halt ( struct x86_emulate_ctxt * ctxt )
{
emul_to_vcpu ( ctxt ) - > arch . halt_request = 1 ;
}
2011-03-28 18:53:59 +04:00
static void emulator_get_fpu ( struct x86_emulate_ctxt * ctxt )
{
preempt_disable ( ) ;
2011-04-20 16:55:40 +04:00
kvm_load_guest_fpu ( emul_to_vcpu ( ctxt ) ) ;
2011-03-28 18:53:59 +04:00
/*
* CR0 . TS may reference the host fpu state , not the guest fpu state ,
* so it may be clear at this point .
*/
clts ( ) ;
}
static void emulator_put_fpu ( struct x86_emulate_ctxt * ctxt )
{
preempt_enable ( ) ;
}
2011-04-20 14:37:53 +04:00
static int emulator_intercept ( struct x86_emulate_ctxt * ctxt ,
2011-04-04 14:39:27 +04:00
struct x86_instruction_info * info ,
2011-04-04 14:39:22 +04:00
enum x86_intercept_stage stage )
{
2011-04-20 14:37:53 +04:00
return kvm_x86_ops - > check_intercept ( emul_to_vcpu ( ctxt ) , info , stage ) ;
2011-04-04 14:39:22 +04:00
}
2012-06-07 15:10:16 +04:00
static void emulator_get_cpuid ( struct x86_emulate_ctxt * ctxt ,
2012-01-12 19:43:03 +04:00
u32 * eax , u32 * ebx , u32 * ecx , u32 * edx )
{
2012-06-07 15:10:16 +04:00
kvm_cpuid ( emul_to_vcpu ( ctxt ) , eax , ebx , ecx , edx ) ;
2012-01-12 19:43:03 +04:00
}
2008-02-19 21:25:50 +03:00
static struct x86_emulate_ops emulate_ops = {
2010-02-10 15:21:32 +03:00
. read_std = kvm_read_guest_virt_system ,
2010-03-18 16:20:16 +03:00
. write_std = kvm_write_guest_virt_system ,
2010-02-10 15:21:32 +03:00
. fetch = kvm_fetch_guest_virt ,
2007-10-30 20:44:21 +03:00
. read_emulated = emulator_read_emulated ,
. write_emulated = emulator_write_emulated ,
. cmpxchg_emulated = emulator_cmpxchg_emulated ,
2011-04-20 16:38:44 +04:00
. invlpg = emulator_invlpg ,
2010-03-18 16:20:23 +03:00
. pio_in_emulated = emulator_pio_in_emulated ,
. pio_out_emulated = emulator_pio_out_emulated ,
2011-04-27 14:20:30 +04:00
. get_segment = emulator_get_segment ,
. set_segment = emulator_set_segment ,
2010-04-28 20:15:29 +04:00
. get_cached_segment_base = emulator_get_cached_segment_base ,
2010-03-18 16:20:16 +03:00
. get_gdt = emulator_get_gdt ,
2010-08-04 06:44:24 +04:00
. get_idt = emulator_get_idt ,
2011-04-20 16:12:00 +04:00
. set_gdt = emulator_set_gdt ,
. set_idt = emulator_set_idt ,
2010-03-18 16:20:03 +03:00
. get_cr = emulator_get_cr ,
. set_cr = emulator_set_cr ,
2012-02-08 17:34:41 +04:00
. set_rflags = emulator_set_rflags ,
2010-03-18 16:20:05 +03:00
. cpl = emulator_get_cpl ,
2010-04-28 20:15:27 +04:00
. get_dr = emulator_get_dr ,
. set_dr = emulator_set_dr ,
2011-04-20 14:37:53 +04:00
. set_msr = emulator_set_msr ,
. get_msr = emulator_get_msr ,
2011-11-10 16:57:30 +04:00
. read_pmc = emulator_read_pmc ,
2011-04-20 16:43:05 +04:00
. halt = emulator_halt ,
2011-04-20 16:53:23 +04:00
. wbinvd = emulator_wbinvd ,
2011-04-20 16:47:13 +04:00
. fix_hypercall = emulator_fix_hypercall ,
2011-03-28 18:53:59 +04:00
. get_fpu = emulator_get_fpu ,
. put_fpu = emulator_put_fpu ,
2011-04-04 14:39:22 +04:00
. intercept = emulator_intercept ,
2012-01-12 19:43:03 +04:00
. get_cpuid = emulator_get_cpuid ,
2007-10-30 20:44:21 +03:00
} ;
2008-06-27 21:58:02 +04:00
static void cache_all_regs ( struct kvm_vcpu * vcpu )
{
kvm_register_read ( vcpu , VCPU_REGS_RAX ) ;
kvm_register_read ( vcpu , VCPU_REGS_RSP ) ;
kvm_register_read ( vcpu , VCPU_REGS_RIP ) ;
vcpu - > arch . regs_dirty = ~ 0 ;
}
2010-04-28 20:15:43 +04:00
static void toggle_interruptibility ( struct kvm_vcpu * vcpu , u32 mask )
{
u32 int_shadow = kvm_x86_ops - > get_interrupt_shadow ( vcpu , mask ) ;
/*
* An "sti; sti" sequence only disables interrupts for the first
* instruction. So, if the last instruction, be it emulated or
* not, left the system with the INT_STI flag enabled, it
* means that the last instruction was an sti. We should not
* leave the flag on in this case. The same goes for mov ss.
*/
if ( ! ( int_shadow & mask ) )
kvm_x86_ops - > set_interrupt_shadow ( vcpu , mask ) ;
}
2010-04-28 20:15:44 +04:00
static void inject_emulated_exception ( struct kvm_vcpu * vcpu )
{
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
2010-11-22 18:53:21 +03:00
if ( ctxt - > exception . vector = = PF_VECTOR )
2010-11-29 17:12:30 +03:00
kvm_propagate_fault ( vcpu , & ctxt - > exception ) ;
2010-11-22 18:53:21 +03:00
else if ( ctxt - > exception . error_code_valid )
kvm_queue_exception_e ( vcpu , ctxt - > exception . vector ,
ctxt - > exception . error_code ) ;
2010-04-28 20:15:44 +04:00
else
2010-11-22 18:53:21 +03:00
kvm_queue_exception ( vcpu , ctxt - > exception . vector ) ;
2010-04-28 20:15:44 +04:00
}
2011-06-01 16:34:25 +04:00
static void init_decode_cache ( struct x86_emulate_ctxt * ctxt ,
2011-05-25 06:09:38 +04:00
const unsigned long * regs )
{
2011-06-01 16:34:25 +04:00
memset ( & ctxt - > twobyte , 0 ,
( void * ) & ctxt - > regs - ( void * ) & ctxt - > twobyte ) ;
memcpy ( ctxt - > regs , regs , sizeof ( ctxt - > regs ) ) ;
2011-05-25 06:09:38 +04:00
2011-06-01 16:34:25 +04:00
ctxt - > fetch . start = 0 ;
ctxt - > fetch . end = 0 ;
ctxt - > io_read . pos = 0 ;
ctxt - > io_read . end = 0 ;
ctxt - > mem_read . pos = 0 ;
ctxt - > mem_read . end = 0 ;
2011-05-25 06:09:38 +04:00
}
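init_decode_cache clears a contiguous run of struct fields with a single memset whose length is the distance between two members. The sketch below shows the same idea in portable C using offsetof instead of void-pointer arithmetic. struct sketch_ctxt and its field layout are invented for illustration and are much smaller than the real x86_emulate_ctxt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for the emulation context. */
struct sketch_ctxt {
	int mode;		/* set up elsewhere, must survive */
	unsigned char twobyte;	/* first field of the decode state */
	int op_bytes;
	int ad_bytes;
	unsigned long regs[4];	/* first field after the decode state */
};

/* Zero everything from twobyte up to (not including) regs, then load regs. */
static void sketch_init_decode(struct sketch_ctxt *c,
			       const unsigned long *regs)
{
	size_t start = offsetof(struct sketch_ctxt, twobyte);
	size_t end = offsetof(struct sketch_ctxt, regs);

	assert(start < end);
	memset((char *)c + start, 0, end - start);
	memcpy(c->regs, regs, sizeof(c->regs));
}
```

The approach depends on field order: any member that should be reset per instruction has to sit between the two marker fields, and anything that must persist has to sit outside them.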
2010-08-16 01:47:01 +04:00
static void init_emulate_ctxt ( struct kvm_vcpu * vcpu )
{
2011-05-25 06:06:16 +04:00
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
2010-08-16 01:47:01 +04:00
int cs_db , cs_l ;
2011-04-12 13:36:25 +04:00
/*
* TODO: fix emulate.c to use guest_read/write_register
* instead of direct ->regs accesses; this can save hundreds of cycles
* on Intel for instructions that don't read/change RSP, for
* example.
*/
2010-08-16 01:47:01 +04:00
cache_all_regs ( vcpu ) ;
kvm_x86_ops - > get_cs_db_l_bits ( vcpu , & cs_db , & cs_l ) ;
2011-05-25 06:06:16 +04:00
ctxt - > eflags = kvm_get_rflags ( vcpu ) ;
ctxt - > eip = kvm_rip_read ( vcpu ) ;
ctxt - > mode = ( ! is_protmode ( vcpu ) ) ? X86EMUL_MODE_REAL :
( ctxt - > eflags & X86_EFLAGS_VM ) ? X86EMUL_MODE_VM86 :
cs_l ? X86EMUL_MODE_PROT64 :
cs_db ? X86EMUL_MODE_PROT32 :
X86EMUL_MODE_PROT16 ;
ctxt - > guest_mode = is_guest_mode ( vcpu ) ;
2011-06-01 16:34:25 +04:00
init_decode_cache ( ctxt , vcpu - > arch . regs ) ;
2011-03-31 14:06:41 +04:00
vcpu - > arch . emulate_regs_need_sync_from_vcpu = false ;
2010-08-16 01:47:01 +04:00
}
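The mode selection in init_emulate_ctxt() is a strict priority chain. A minimal standalone sketch of that ordering follows; the enum, the EFLAGS_VM macro, and pick_emul_mode() are hypothetical stand-ins for illustration and are not part of the kernel source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the X86EMUL_MODE_* constants. */
enum emul_mode {
	MODE_REAL,
	MODE_VM86,
	MODE_PROT16,
	MODE_PROT32,
	MODE_PROT64,
};

/* Virtual-8086 mode flag, bit 17 of EFLAGS. */
#define EFLAGS_VM (1ul << 17)

/*
 * Pick the emulation mode in the same priority order as the nested
 * conditional above: real mode first, then VM86, then the CS.L and
 * CS.D/B bits decide between 64-, 32- and 16-bit protected mode.
 */
static enum emul_mode pick_emul_mode(bool protmode, unsigned long eflags,
				     int cs_db, int cs_l)
{
	if (!protmode)
		return MODE_REAL;
	if (eflags & EFLAGS_VM)
		return MODE_VM86;
	if (cs_l)
		return MODE_PROT64;
	if (cs_db)
		return MODE_PROT32;
	return MODE_PROT16;
}

/* Usage: real mode wins even if EFLAGS.VM and CS.L are set. */
static void pick_emul_mode_example(void)
{
	assert(pick_emul_mode(false, EFLAGS_VM, 1, 1) == MODE_REAL);
}
```

Writing the chain as early returns makes the precedence explicit, which is easy to misread in the nested ternary form.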
2011-04-13 18:12:54 +04:00
int kvm_inject_realmode_interrupt ( struct kvm_vcpu * vcpu , int irq , int inc_eip )
2010-09-19 16:34:06 +04:00
{
2011-05-29 16:53:48 +04:00
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
2010-09-19 16:34:06 +04:00
int ret ;
init_emulate_ctxt ( vcpu ) ;
2011-06-01 16:34:25 +04:00
ctxt - > op_bytes = 2 ;
ctxt - > ad_bytes = 2 ;
ctxt - > _eip = ctxt - > eip + inc_eip ;
2011-05-29 16:53:48 +04:00
ret = emulate_int_real ( ctxt , irq ) ;
2010-09-19 16:34:06 +04:00
if ( ret ! = X86EMUL_CONTINUE )
return EMULATE_FAIL ;
2011-06-01 16:34:25 +04:00
ctxt - > eip = ctxt - > _eip ;
memcpy ( vcpu - > arch . regs , ctxt - > regs , sizeof ctxt - > regs ) ;
2011-05-29 16:53:48 +04:00
kvm_rip_write ( vcpu , ctxt - > eip ) ;
kvm_set_rflags ( vcpu , ctxt - > eflags ) ;
2010-09-19 16:34:06 +04:00
if ( irq = = NMI_VECTOR )
2011-09-20 14:43:14 +04:00
vcpu - > arch . nmi_pending = 0 ;
2010-09-19 16:34:06 +04:00
else
vcpu - > arch . interrupt . pending = false ;
return EMULATE_DONE ;
}
EXPORT_SYMBOL_GPL ( kvm_inject_realmode_interrupt ) ;
2010-05-10 12:16:56 +04:00
static int handle_emulation_failure ( struct kvm_vcpu * vcpu )
{
2010-11-29 19:51:49 +03:00
int r = EMULATE_DONE ;
2010-05-10 12:16:56 +04:00
+ + vcpu - > stat . insn_emulation_fail ;
trace_kvm_emulate_insn_failed ( vcpu ) ;
2010-11-29 19:51:49 +03:00
if ( ! is_guest_mode ( vcpu ) ) {
vcpu - > run - > exit_reason = KVM_EXIT_INTERNAL_ERROR ;
vcpu - > run - > internal . suberror = KVM_INTERNAL_ERROR_EMULATION ;
vcpu - > run - > internal . ndata = 0 ;
r = EMULATE_FAIL ;
}
2010-05-10 12:16:56 +04:00
kvm_queue_exception ( vcpu , UD_VECTOR ) ;
2010-11-29 19:51:49 +03:00
return r ;
2010-05-10 12:16:56 +04:00
}
2010-07-08 13:41:12 +04:00
static bool reexecute_instruction ( struct kvm_vcpu * vcpu , gva_t gva )
{
gpa_t gpa ;
2010-07-14 20:05:45 +04:00
if ( tdp_enabled )
return false ;
2010-07-08 13:41:12 +04:00
/*
* If emulation was due to an access to a shadowed page table
* and it failed, try to unshadow the page and re-enter the
* guest to let the CPU execute the instruction.
*/
if ( kvm_mmu_unprotect_page_virt ( vcpu , gva ) )
return true ;
gpa = kvm_mmu_gva_to_gpa_system ( vcpu , gva , NULL ) ;
if ( gpa = = UNMAPPED_GVA )
return true ; /* let cpu generate fault */
if ( ! kvm_is_error_hva ( gfn_to_hva ( vcpu - > kvm , gpa > > PAGE_SHIFT ) ) )
return true ;
return false ;
}
2011-09-22 13:02:48 +04:00
static bool retry_instruction ( struct x86_emulate_ctxt * ctxt ,
unsigned long cr2 , int emulation_type )
{
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
unsigned long last_retry_eip , last_retry_addr , gpa = cr2 ;
last_retry_eip = vcpu - > arch . last_retry_eip ;
last_retry_addr = vcpu - > arch . last_retry_addr ;
/*
* If the emulation is caused by #PF and the instruction does not
* write a page table, the VM-exit was caused by a write-protected
* shadow page; we can zap the shadow page and retry the
* instruction directly.
*
* Note: if the guest uses a non-page-table-modifying instruction
* on the PDE that points to the instruction, then we will unmap
* the instruction and enter an infinite loop. So we cache the
* last retried eip and the last fault address; if we see the same
* eip and address again, we break out of the potential infinite
* loop.
*/
vcpu - > arch . last_retry_eip = vcpu - > arch . last_retry_addr = 0 ;
if ( ! ( emulation_type & EMULTYPE_RETRY ) )
return false ;
if ( x86_page_table_writing_insn ( ctxt ) )
return false ;
if ( ctxt - > eip = = last_retry_eip & & last_retry_addr = = cr2 )
return false ;
vcpu - > arch . last_retry_eip = ctxt - > eip ;
vcpu - > arch . last_retry_addr = cr2 ;
if ( ! vcpu - > arch . mmu . direct_map )
gpa = kvm_mmu_gva_to_gpa_write ( vcpu , cr2 , NULL ) ;
kvm_mmu_unprotect_page ( vcpu - > kvm , gpa > > PAGE_SHIFT ) ;
return true ;
}
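The loop-breaking logic in retry_instruction() can be isolated from the MMU calls. The sketch below models only the cached (eip, fault address) pair; struct retry_state and should_retry() are hypothetical names used for illustration, and the EMULTYPE_RETRY and page-table-write checks are deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-vCPU state: the last (eip, fault address) pair retried. */
struct retry_state {
	unsigned long last_eip;
	unsigned long last_addr;
};

/*
 * Return true if the instruction should be retried. The cache is cleared
 * on every call and only re-armed when a retry is granted, so a second
 * consecutive fault at the same (eip, addr) is refused.
 */
static bool should_retry(struct retry_state *st, unsigned long eip,
			 unsigned long addr)
{
	unsigned long last_eip = st->last_eip;
	unsigned long last_addr = st->last_addr;

	st->last_eip = st->last_addr = 0;

	if (eip == last_eip && addr == last_addr)
		return false;

	st->last_eip = eip;
	st->last_addr = addr;
	return true;
}

/* Usage: the first fault is retried, an immediate repeat is not. */
static void should_retry_example(void)
{
	struct retry_state st = { 0, 0 };

	assert(should_retry(&st, 0x1000, 0x2000));
	assert(!should_retry(&st, 0x1000, 0x2000));
}
```

Because the cache is reset before the comparison, a refused retry does not block later attempts at the same address; only back-to-back repeats are rejected.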
2010-12-21 13:12:02 +03:00
int x86_emulate_instruction ( struct kvm_vcpu * vcpu ,
unsigned long cr2 ,
2010-12-21 13:12:07 +03:00
int emulation_type ,
void * insn ,
int insn_len )
KVM: Portability: Move x86 emulation and mmio device hook to x86.c
This patch moves the following functions from kvm_main.c to x86.c:
emulator_read/write_std, vcpu_find_pervcpu_dev, vcpu_find_mmio_dev,
emulator_read/write_emulated, emulator_write_phys,
emulator_write_emulated_onepage, emulator_cmpxchg_emulated,
get_segment_base, emulate_invlpg, emulate_clts, emulator_get/set_dr,
kvm_report_emulation_failure, emulate_instruction
The following data type is moved to x86.c:
struct x86_emulate_ops emulate_ops
Signed-off-by: Carsten Otte <cotte@de.ibm.com>
Acked-by: Hollis Blanchard <hollisb@us.ibm.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2007-10-30 20:44:21 +03:00
{
2010-04-28 20:15:43 +04:00
int r ;
2011-05-29 16:53:48 +04:00
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
2011-03-31 14:06:41 +04:00
bool writeback = true ;
2007-10-30 20:44:21 +03:00
2008-07-03 15:59:22 +04:00
kvm_clear_exception_queue ( vcpu ) ;
2011-04-12 13:36:21 +04:00
KVM: x86 emulator: Only allow VMCALL/VMMCALL trapped by #UD
When executing a test program called "crashme", we found the KVM guest could not
survive more than ten seconds before encountering a kernel panic. The basic concept
of "crashme" is generating random assembly code and trying to execute it.
After some fixes to the emulator's instruction validity checks, we found it hard to
get the current emulator to handle invalid instructions correctly, because the
#UD trap for hypercall patching caused trouble. The problem is that if the opcode
itself was OK, but the combination of opcode and modrm_reg was invalid, and one
operand of the opcode was memory (SrcMem or DstMem), the emulator fetches
the memory operand first rather than checking validity, and may encounter
an error there. For example, ".byte 0xfe, 0x34, 0xcd" has this problem.
In the patch, we simply check whether the invalid opcode is vmcall/vmmcall; if not,
we return from emulate_instruction() and inject a #UD into the guest. With the
patch, the guest ran for more than 12 hours.
Signed-off-by: Feng (Eric) Liu <eric.e.liu@intel.com>
Signed-off-by: Sheng Yang <sheng.yang@intel.com>
Signed-off-by: Avi Kivity <avi@qumranet.com>
2008-01-02 09:49:22 +03:00
if ( ! ( emulation_type & EMULTYPE_NO_DECODE ) ) {
2010-08-16 01:47:01 +04:00
init_emulate_ctxt ( vcpu ) ;
2011-05-29 16:53:48 +04:00
ctxt - > interruptibility = 0 ;
ctxt - > have_exception = false ;
ctxt - > perm_ok = false ;
2007-10-30 20:44:21 +03:00
2011-05-29 16:53:48 +04:00
ctxt - > only_vendor_specific_insn
2011-02-01 17:32:04 +03:00
= emulation_type & EMULTYPE_TRAP_UD ;
2011-05-29 16:53:48 +04:00
r = x86_decode_insn ( ctxt , insn , insn_len ) ;
2007-10-30 20:44:21 +03:00
2010-04-11 14:05:16 +04:00
trace_kvm_emulate_insn_start ( vcpu ) ;
2007-11-18 16:17:51 +03:00
+ + vcpu - > stat . insn_emulation ;
2011-07-30 13:03:34 +04:00
if ( r ! = EMULATION_OK ) {
2011-02-01 17:32:04 +03:00
if ( emulation_type & EMULTYPE_TRAP_UD )
return EMULATE_FAIL ;
2010-07-08 13:41:12 +04:00
if ( reexecute_instruction ( vcpu , cr2 ) )
2007-10-30 20:44:21 +03:00
return EMULATE_DONE ;
2010-05-10 12:16:56 +04:00
if ( emulation_type & EMULTYPE_SKIP )
return EMULATE_FAIL ;
return handle_emulation_failure ( vcpu ) ;
2007-10-30 20:44:21 +03:00
}
}
2009-04-12 14:36:57 +04:00
if ( emulation_type & EMULTYPE_SKIP ) {
2011-06-01 16:34:25 +04:00
kvm_rip_write ( vcpu , ctxt - > _eip ) ;
2009-04-12 14:36:57 +04:00
return EMULATE_DONE ;
}
2011-09-22 13:02:48 +04:00
if ( retry_instruction ( ctxt , cr2 , emulation_type ) )
return EMULATE_DONE ;
2011-03-31 14:06:41 +04:00
/* This is needed for the VMware backdoor interface to work, since it
2010-04-28 20:15:42 +04:00
changes register values during an I/O operation. */
2011-03-31 14:06:41 +04:00
if ( vcpu - > arch . emulate_regs_need_sync_from_vcpu ) {
vcpu - > arch . emulate_regs_need_sync_from_vcpu = false ;
2011-06-01 16:34:25 +04:00
memcpy ( ctxt - > regs , vcpu - > arch . regs , sizeof ctxt - > regs ) ;
2011-03-31 14:06:41 +04:00
}
2010-04-28 20:15:42 +04:00
2010-03-18 16:20:26 +03:00
restart :
2011-05-29 16:53:48 +04:00
r = x86_emulate_insn ( ctxt ) ;
2007-10-30 20:44:21 +03:00
2011-04-04 14:39:24 +04:00
if ( r = = EMULATION_INTERCEPTED )
return EMULATE_DONE ;
2010-08-25 13:47:43 +04:00
if ( r = = EMULATION_FAILED ) {
2010-07-08 13:41:12 +04:00
if ( reexecute_instruction ( vcpu , cr2 ) )
2010-04-28 20:15:35 +04:00
return EMULATE_DONE ;
2010-05-10 12:16:56 +04:00
return handle_emulation_failure ( vcpu ) ;
2007-10-30 20:44:21 +03:00
}
2011-05-29 16:53:48 +04:00
if ( ctxt - > have_exception ) {
2010-04-28 20:15:44 +04:00
inject_emulated_exception ( vcpu ) ;
2010-08-25 13:47:43 +04:00
r = EMULATE_DONE ;
} else if ( vcpu - > arch . pio . count ) {
2010-04-28 20:15:38 +04:00
if ( ! vcpu - > arch . pio . in )
vcpu - > arch . pio . count = 0 ;
2011-03-31 14:06:41 +04:00
else
writeback = false ;
2010-07-29 16:11:52 +04:00
r = EMULATE_DO_MMIO ;
2011-03-31 14:06:41 +04:00
} else if ( vcpu - > mmio_needed ) {
if ( ! vcpu - > mmio_is_write )
writeback = false ;
2010-07-29 16:11:52 +04:00
r = EMULATE_DO_MMIO ;
2011-03-31 14:06:41 +04:00
} else if ( r = = EMULATION_RESTART )
2010-03-18 16:20:26 +03:00
goto restart ;
2010-08-25 13:47:43 +04:00
else
r = EMULATE_DONE ;
2010-02-10 15:21:33 +03:00
2011-03-31 14:06:41 +04:00
if ( writeback ) {
2011-05-29 16:53:48 +04:00
toggle_interruptibility ( vcpu , ctxt - > interruptibility ) ;
kvm_set_rflags ( vcpu , ctxt - > eflags ) ;
2011-03-31 14:06:41 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2011-06-01 16:34:25 +04:00
memcpy ( vcpu - > arch . regs , ctxt - > regs , sizeof ctxt - > regs ) ;
2011-03-31 14:06:41 +04:00
vcpu - > arch . emulate_regs_need_sync_to_vcpu = false ;
2011-05-29 16:53:48 +04:00
kvm_rip_write ( vcpu , ctxt - > eip ) ;
2011-03-31 14:06:41 +04:00
} else
vcpu - > arch . emulate_regs_need_sync_to_vcpu = true ;
2010-07-29 16:11:52 +04:00
return r ;
2007-10-30 20:44:25 +03:00
}
2010-12-21 13:12:02 +03:00
EXPORT_SYMBOL_GPL ( x86_emulate_instruction ) ;
2007-10-30 20:44:25 +03:00
2010-03-18 16:20:23 +03:00
int kvm_fast_pio_out ( struct kvm_vcpu * vcpu , int size , unsigned short port )
2007-10-30 20:44:25 +03:00
{
2010-03-18 16:20:23 +03:00
unsigned long val = kvm_register_read ( vcpu , VCPU_REGS_RAX ) ;
2011-04-20 14:37:53 +04:00
int ret = emulator_pio_out_emulated ( & vcpu - > arch . emulate_ctxt ,
size , port , & val , 1 ) ;
2010-03-18 16:20:23 +03:00
/* do not return to emulator after return from userspace */
2010-03-18 16:20:24 +03:00
vcpu - > arch . pio . count = 0 ;
2007-10-30 20:44:25 +03:00
return ret ;
}
2010-03-18 16:20:23 +03:00
EXPORT_SYMBOL_GPL ( kvm_fast_pio_out ) ;
2007-10-30 20:44:25 +03:00
2010-08-20 12:07:21 +04:00
static void tsc_bad ( void * info )
{
2010-12-18 18:28:55 +03:00
__this_cpu_write ( cpu_tsc_khz , 0 ) ;
2010-08-20 12:07:21 +04:00
}
static void tsc_khz_changed ( void * data )
2009-02-04 19:52:04 +03:00
{
2010-08-20 12:07:21 +04:00
struct cpufreq_freqs * freq = data ;
unsigned long khz = 0 ;
if ( data )
khz = freq - > new ;
else if ( ! boot_cpu_has ( X86_FEATURE_CONSTANT_TSC ) )
khz = cpufreq_quick_get ( raw_smp_processor_id ( ) ) ;
if ( ! khz )
khz = tsc_khz ;
2010-12-18 18:28:55 +03:00
__this_cpu_write ( cpu_tsc_khz , khz ) ;
2009-02-04 19:52:04 +03:00
}
static int kvmclock_cpufreq_notifier ( struct notifier_block * nb , unsigned long val ,
void * data )
{
struct cpufreq_freqs * freq = data ;
struct kvm * kvm ;
struct kvm_vcpu * vcpu ;
int i , send_ipi = 0 ;
2010-08-20 12:07:21 +04:00
/*
* We allow guests to temporarily run on slowing clocks ,
* provided we notify them after , or to run on accelerating
* clocks , provided we notify them before . Thus time never
* goes backwards .
*
* However , we have a problem . We can ' t atomically update
* the frequency of a given CPU from this function ; it is
* merely a notifier , which can be called from any CPU .
* Changing the TSC frequency at arbitrary points in time
* requires a recomputation of local variables related to
* the TSC for each VCPU . We must flag these local variables
* to be updated and be sure the update takes place with the
* new frequency before any guests proceed .
*
* Unfortunately , the combination of hotplug CPU and frequency
* change creates an intractable locking scenario ; the order
* of when these callouts happen is undefined with respect to
* CPU hotplug , and they can race with each other . As such ,
* merely setting per_cpu ( cpu_tsc_khz ) = X during a hotadd is
* undefined ; you can actually have a CPU frequency change take
* place in between the computation of X and the setting of the
* variable . To protect against this problem , all updates of
* the per_cpu tsc_khz variable are done in an interrupt
* protected IPI , and all callers wishing to update the value
* must wait for a synchronous IPI to complete ( which is trivial
* if the caller is on the CPU already ) . This establishes the
* necessary total order on variable updates .
*
* Note that because a guest time update may take place
* anytime after the setting of the VCPU ' s request bit , the
* correct TSC value must be set before the request . However ,
* to ensure the update actually makes it to any guest which
* starts running in hardware virtualization between the set
* and the acquisition of the spinlock , we must also ping the
* CPU after setting the request bit .
*
*/
2009-02-04 19:52:04 +03:00
if ( val = = CPUFREQ_PRECHANGE & & freq - > old > freq - > new )
return 0 ;
if ( val = = CPUFREQ_POSTCHANGE & & freq - > old < freq - > new )
return 0 ;
2010-08-20 12:07:21 +04:00
smp_call_function_single ( freq - > cpu , tsc_khz_changed , freq , 1 ) ;
2009-02-04 19:52:04 +03:00
2011-02-08 14:55:33 +03:00
raw_spin_lock ( & kvm_lock ) ;
2009-02-04 19:52:04 +03:00
list_for_each_entry ( kvm , & vm_list , vm_list ) {
2009-06-09 16:56:29 +04:00
kvm_for_each_vcpu ( i , vcpu , kvm ) {
2009-02-04 19:52:04 +03:00
if ( vcpu - > cpu ! = freq - > cpu )
continue ;
2010-09-19 04:38:15 +04:00
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ;
2009-02-04 19:52:04 +03:00
if ( vcpu - > cpu ! = smp_processor_id ( ) )
2010-08-20 12:07:21 +04:00
send_ipi = 1 ;
2009-02-04 19:52:04 +03:00
}
}
2011-02-08 14:55:33 +03:00
raw_spin_unlock ( & kvm_lock ) ;
2009-02-04 19:52:04 +03:00
if ( freq - > old < freq - > new & & send_ipi ) {
/*
* We upscale the frequency. We must make sure the guest
* doesn't see old kvmclock values while running with
* the new frequency; otherwise we risk the guest seeing
* time go backwards.
*
* In case we update the frequency for another cpu
* (which might be in guest context), send an interrupt
* to kick the cpu out of guest context. The next time
* guest context is entered, kvmclock will be updated,
* so the guest will not see stale values.
*/
2010-08-20 12:07:21 +04:00
smp_call_function_single ( freq - > cpu , tsc_khz_changed , freq , 1 ) ;
2009-02-04 19:52:04 +03:00
}
return 0 ;
}
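The two early returns at the top of kvmclock_cpufreq_notifier() encode the "notify after when slowing, notify before when accelerating" rule from the comment. A minimal sketch of that decision alone, assuming a hypothetical enum in place of CPUFREQ_PRECHANGE and CPUFREQ_POSTCHANGE and a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CPUFREQ_PRECHANGE / CPUFREQ_POSTCHANGE. */
enum freq_event {
	FREQ_PRECHANGE,
	FREQ_POSTCHANGE,
};

/*
 * Decide whether this notification should trigger a clock update.
 * Slowing down: skip the pre-change event and update afterwards.
 * Speeding up: update on the pre-change event and skip the post-change one.
 */
static bool must_update_now(enum freq_event ev, unsigned int old_khz,
			    unsigned int new_khz)
{
	if (ev == FREQ_PRECHANGE && old_khz > new_khz)
		return false;
	if (ev == FREQ_POSTCHANGE && old_khz < new_khz)
		return false;
	return true;
}

/* Usage: a speed-up from 1 GHz to 2 GHz is handled before the change. */
static void must_update_now_example(void)
{
	assert(must_update_now(FREQ_PRECHANGE, 1000000, 2000000));
	assert(!must_update_now(FREQ_POSTCHANGE, 1000000, 2000000));
}
```

Either way the guest only ever runs with a clock rate that is at most what it was told, which is why time cannot appear to go backwards.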
static struct notifier_block kvmclock_cpufreq_notifier_block = {
2010-08-20 12:07:21 +04:00
. notifier_call = kvmclock_cpufreq_notifier
} ;
static int kvmclock_cpu_notifier ( struct notifier_block * nfb ,
unsigned long action , void * hcpu )
{
unsigned int cpu = ( unsigned long ) hcpu ;
switch ( action ) {
case CPU_ONLINE :
case CPU_DOWN_FAILED :
smp_call_function_single ( cpu , tsc_khz_changed , NULL , 1 ) ;
break ;
case CPU_DOWN_PREPARE :
smp_call_function_single ( cpu , tsc_bad , NULL , 1 ) ;
break ;
}
return NOTIFY_OK ;
}
static struct notifier_block kvmclock_cpu_notifier_block = {
. notifier_call = kvmclock_cpu_notifier ,
. priority = - INT_MAX
2009-02-04 19:52:04 +03:00
} ;
2009-09-30 01:38:34 +04:00
static void kvm_timer_init ( void )
{
int cpu ;
2010-09-19 04:38:15 +04:00
max_tsc_khz = tsc_khz ;
2010-08-20 12:07:21 +04:00
register_hotcpu_notifier ( & kvmclock_cpu_notifier_block ) ;
2009-09-30 01:38:34 +04:00
if ( ! boot_cpu_has ( X86_FEATURE_CONSTANT_TSC ) ) {
2010-09-19 04:38:15 +04:00
# ifdef CONFIG_CPU_FREQ
struct cpufreq_policy policy ;
memset ( & policy , 0 , sizeof ( policy ) ) ;
2010-12-16 13:16:34 +03:00
cpu = get_cpu ( ) ;
cpufreq_get_policy ( & policy , cpu ) ;
2010-09-19 04:38:15 +04:00
if ( policy . cpuinfo . max_freq )
max_tsc_khz = policy . cpuinfo . max_freq ;
2010-12-16 13:16:34 +03:00
put_cpu ( ) ;
2010-09-19 04:38:15 +04:00
# endif
2009-09-30 01:38:34 +04:00
cpufreq_register_notifier ( & kvmclock_cpufreq_notifier_block ,
CPUFREQ_TRANSITION_NOTIFIER ) ;
}
2010-09-19 04:38:15 +04:00
pr_debug ( " kvm: max_tsc_khz = %ld \n " , max_tsc_khz ) ;
2010-08-20 12:07:21 +04:00
for_each_online_cpu ( cpu )
smp_call_function_single ( cpu , tsc_khz_changed , NULL , 1 ) ;
2009-09-30 01:38:34 +04:00
}
2010-04-19 09:32:45 +04:00
static DEFINE_PER_CPU ( struct kvm_vcpu * , current_vcpu ) ;
2011-11-10 16:57:22 +04:00
int kvm_is_in_guest ( void )
2010-04-19 09:32:45 +04:00
{
2011-10-20 11:34:01 +04:00
return __this_cpu_read ( current_vcpu ) ! = NULL ;
2010-04-19 09:32:45 +04:00
}
static int kvm_is_user_mode ( void )
{
int user_mode = 3 ;
2010-04-20 06:13:58 +04:00
2011-10-20 11:34:01 +04:00
if ( __this_cpu_read ( current_vcpu ) )
user_mode = kvm_x86_ops - > get_cpl ( __this_cpu_read ( current_vcpu ) ) ;
2010-04-20 06:13:58 +04:00
2010-04-19 09:32:45 +04:00
return user_mode ! = 0 ;
}
static unsigned long kvm_get_guest_ip ( void )
{
unsigned long ip = 0 ;
2010-04-20 06:13:58 +04:00
2011-10-20 11:34:01 +04:00
if ( __this_cpu_read ( current_vcpu ) )
ip = kvm_rip_read ( __this_cpu_read ( current_vcpu ) ) ;
2010-04-20 06:13:58 +04:00
2010-04-19 09:32:45 +04:00
return ip ;
}
static struct perf_guest_info_callbacks kvm_guest_cbs = {
. is_in_guest = kvm_is_in_guest ,
. is_user_mode = kvm_is_user_mode ,
. get_guest_ip = kvm_get_guest_ip ,
} ;
void kvm_before_handle_nmi ( struct kvm_vcpu * vcpu )
{
2011-10-20 11:34:01 +04:00
__this_cpu_write ( current_vcpu , vcpu ) ;
2010-04-19 09:32:45 +04:00
}
EXPORT_SYMBOL_GPL ( kvm_before_handle_nmi ) ;
void kvm_after_handle_nmi ( struct kvm_vcpu * vcpu )
{
2011-10-20 11:34:01 +04:00
__this_cpu_write ( current_vcpu , NULL ) ;
2010-04-19 09:32:45 +04:00
}
EXPORT_SYMBOL_GPL ( kvm_after_handle_nmi ) ;
2011-07-11 23:33:44 +04:00
static void kvm_set_mmio_spte_mask ( void )
{
u64 mask ;
int maxphyaddr = boot_cpu_data . x86_phys_bits ;
/*
* Set the reserved bits and the present bit of a paging-structure
* entry to generate a page fault with PFER.RSV = 1.
*/
mask = ( ( 1ull < < ( 62 - maxphyaddr + 1 ) ) - 1 ) < < maxphyaddr ;
mask | = 1ull ;
# ifdef CONFIG_X86_64
/*
* If reserved bits are not supported, clear the present bit to disable
* mmio page faults.
*/
if ( maxphyaddr = = 52 )
mask & = ~ 1ull ;
# endif
kvm_mmu_set_mmio_spte_mask ( mask ) ;
}
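The mask arithmetic in kvm_set_mmio_spte_mask() is easier to follow with concrete numbers. The sketch below reproduces the computation as a pure function; mmio_spte_mask() is a hypothetical name, and the sketch assumes a 64-bit host so the maxphyaddr == 52 special case is always compiled in.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Compute the MMIO SPTE mask for a given physical address width:
 * bits maxphyaddr..62 (reserved on that CPU) plus the present bit.
 * With 52 physical address bits, reserved-bit faults are treated as
 * unavailable, so the present bit is cleared again.
 */
static uint64_t mmio_spte_mask(int maxphyaddr)
{
	uint64_t mask;

	/* Keep both shifts within the width of a 64-bit value. */
	assert(maxphyaddr >= 1 && maxphyaddr <= 52);

	mask = ((1ull << (62 - maxphyaddr + 1)) - 1) << maxphyaddr;
	mask |= 1ull;

	if (maxphyaddr == 52)
		mask &= ~1ull;

	return mask;
}

/* Usage: a 40-bit CPU gets bits 40..62 and bit 0 set. */
static void mmio_spte_mask_example(void)
{
	assert(mmio_spte_mask(40) == 0x7fffff0000000001ull);
}
```

For maxphyaddr = 40 the shifted field is 23 bits wide (bits 40 through 62); bit 63 is left alone because it is the NX bit rather than a reserved address bit.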
2007-11-14 15:40:21 +03:00
int kvm_arch_init ( void * opaque )
2007-10-10 19:16:19 +04:00
{
2009-09-30 01:38:34 +04:00
int r ;
2007-11-14 15:40:21 +03:00
struct kvm_x86_ops * ops = ( struct kvm_x86_ops * ) opaque ;
if ( kvm_x86_ops ) {
printk ( KERN_ERR " kvm: already loaded the other module \n " ) ;
2007-11-18 15:43:21 +03:00
r = - EEXIST ;
goto out ;
2007-11-14 15:40:21 +03:00
}
if ( ! ops - > cpu_has_kvm_support ( ) ) {
printk ( KERN_ERR " kvm: no hardware support \n " ) ;
2007-11-18 15:43:21 +03:00
r = - EOPNOTSUPP ;
goto out ;
2007-11-14 15:40:21 +03:00
}
if ( ops - > disabled_by_bios ( ) ) {
printk ( KERN_ERR " kvm: disabled by bios \n " ) ;
2007-11-18 15:43:21 +03:00
r = - EOPNOTSUPP ;
goto out ;
2007-11-14 15:40:21 +03:00
}
2008-01-13 14:23:56 +03:00
r = kvm_mmu_module_init ( ) ;
if ( r )
goto out ;
2011-07-11 23:33:44 +04:00
kvm_set_mmio_spte_mask ( ) ;
2008-01-13 14:23:56 +03:00
kvm_init_msr_list ( ) ;
2007-11-14 15:40:21 +03:00
kvm_x86_ops = ops ;
2008-04-25 17:13:50 +04:00
kvm_mmu_set_mask_ptes ( PT_USER_MASK , PT_ACCESSED_MASK ,
2009-04-27 16:35:42 +04:00
PT_DIRTY_MASK , PT64_NX_MASK , 0 ) ;
2009-02-04 19:52:04 +03:00
2009-09-30 01:38:34 +04:00
kvm_timer_init ( ) ;
2009-02-04 19:52:04 +03:00
2010-04-19 09:32:45 +04:00
perf_register_guest_info_callbacks ( & kvm_guest_cbs ) ;
2010-06-10 07:27:12 +04:00
if ( cpu_has_xsave )
host_xcr0 = xgetbv ( XCR_XFEATURE_ENABLED_MASK ) ;
2007-11-14 15:40:21 +03:00
return 0 ;
2007-11-18 15:43:21 +03:00
out :
return r ;
2007-10-10 19:16:19 +04:00
}
2007-11-01 01:24:24 +03:00
2007-11-14 15:40:21 +03:00
void kvm_arch_exit ( void )
{
2010-04-19 09:32:45 +04:00
perf_unregister_guest_info_callbacks ( & kvm_guest_cbs ) ;
2009-04-17 21:24:58 +04:00
if ( ! boot_cpu_has ( X86_FEATURE_CONSTANT_TSC ) )
cpufreq_unregister_notifier ( & kvmclock_cpufreq_notifier_block ,
CPUFREQ_TRANSITION_NOTIFIER ) ;
2010-08-20 12:07:21 +04:00
unregister_hotcpu_notifier ( & kvmclock_cpu_notifier_block ) ;
2007-11-14 15:40:21 +03:00
kvm_x86_ops = NULL ;
2007-11-18 15:43:21 +03:00
kvm_mmu_module_exit ( ) ;
}
2007-11-14 15:40:21 +03:00
2007-11-01 01:24:24 +03:00
int kvm_emulate_halt ( struct kvm_vcpu * vcpu )
{
+ + vcpu - > stat . halt_exits ;
if ( irqchip_in_kernel ( vcpu - > kvm ) ) {
2008-04-13 18:54:35 +04:00
vcpu - > arch . mp_state = KVM_MP_STATE_HALTED ;
2007-11-01 01:24:24 +03:00
return 1 ;
} else {
vcpu - > run - > exit_reason = KVM_EXIT_HLT ;
return 0 ;
}
}
EXPORT_SYMBOL_GPL ( kvm_emulate_halt ) ;
2010-01-17 16:51:22 +03:00
int kvm_hv_hypercall ( struct kvm_vcpu * vcpu )
{
u64 param , ingpa , outgpa , ret ;
uint16_t code , rep_idx , rep_cnt , res = HV_STATUS_SUCCESS , rep_done = 0 ;
bool fast , longmode ;
int cs_db , cs_l ;
/*
* A hypercall generates #UD when issued from non-zero CPL or from
* real mode, per the Hyper-V spec.
*/
2010-01-21 16:31:48 +03:00
if ( kvm_x86_ops - > get_cpl ( vcpu ) ! = 0 | | ! is_protmode ( vcpu ) ) {
2010-01-17 16:51:22 +03:00
kvm_queue_exception ( vcpu , UD_VECTOR ) ;
return 0 ;
}
kvm_x86_ops - > get_cs_db_l_bits ( vcpu , & cs_db , & cs_l ) ;
longmode = is_long_mode ( vcpu ) & & cs_l = = 1 ;
if ( ! longmode ) {
2010-01-19 16:06:38 +03:00
param = ( ( u64 ) kvm_register_read ( vcpu , VCPU_REGS_RDX ) < < 32 ) |
( kvm_register_read ( vcpu , VCPU_REGS_RAX ) & 0xffffffff ) ;
ingpa = ( ( u64 ) kvm_register_read ( vcpu , VCPU_REGS_RBX ) < < 32 ) |
( kvm_register_read ( vcpu , VCPU_REGS_RCX ) & 0xffffffff ) ;
outgpa = ( ( u64 ) kvm_register_read ( vcpu , VCPU_REGS_RDI ) < < 32 ) |
( kvm_register_read ( vcpu , VCPU_REGS_RSI ) & 0xffffffff ) ;
2010-01-17 16:51:22 +03:00
}
# ifdef CONFIG_X86_64
else {
param = kvm_register_read ( vcpu , VCPU_REGS_RCX ) ;
ingpa = kvm_register_read ( vcpu , VCPU_REGS_RDX ) ;
outgpa = kvm_register_read ( vcpu , VCPU_REGS_R8 ) ;
}
# endif
code = param & 0xffff ;
fast = ( param > > 16 ) & 0x1 ;
rep_cnt = ( param > > 32 ) & 0xfff ;
rep_idx = ( param > > 48 ) & 0xfff ;
trace_kvm_hv_hypercall ( code , fast , rep_cnt , rep_idx , ingpa , outgpa ) ;
2010-01-17 16:51:24 +03:00
switch ( code ) {
case HV_X64_HV_NOTIFY_LONG_SPIN_WAIT :
kvm_vcpu_on_spin ( vcpu ) ;
break ;
default :
res = HV_STATUS_INVALID_HYPERCALL_CODE ;
break ;
}
2010-01-17 16:51:22 +03:00
ret = res | ( ( ( u64 ) rep_done & 0xfff ) < < 32 ) ;
if ( longmode ) {
kvm_register_write ( vcpu , VCPU_REGS_RAX , ret ) ;
} else {
kvm_register_write ( vcpu , VCPU_REGS_RDX , ret > > 32 ) ;
kvm_register_write ( vcpu , VCPU_REGS_RAX , ret & 0xffffffff ) ;
}
return 1 ;
}
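The bit-field handling in kvm_hv_hypercall() can be exercised without a vCPU. The following sketch mirrors the shifts and masks used above for the input value and the result; struct hv_call, hv_decode() and hv_encode_result() are hypothetical helpers for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fields unpacked from the input value. */
struct hv_call {
	uint16_t code;
	bool fast;
	uint16_t rep_cnt;
	uint16_t rep_idx;
};

/* Unpack the hypercall input value with the same masks as the code above. */
static struct hv_call hv_decode(uint64_t param)
{
	struct hv_call c;

	c.code = param & 0xffff;
	c.fast = (param >> 16) & 0x1;
	c.rep_cnt = (param >> 32) & 0xfff;
	c.rep_idx = (param >> 48) & 0xfff;
	return c;
}

/* Pack the status code and completed-rep count into the 64-bit result. */
static uint64_t hv_encode_result(uint16_t res, uint16_t rep_done)
{
	return res | (((uint64_t)rep_done & 0xfff) << 32);
}

/* Usage: call code 8 with the fast flag set. */
static void hv_decode_example(void)
{
	struct hv_call c = hv_decode(0x0000000000010008ull);

	assert(c.code == 8 && c.fast);
}
```

Outside long mode the same 64-bit values are split across register pairs (for example RDX:RAX for the result), so the packing is identical and only the transport differs.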
2007-11-01 01:24:24 +03:00
int kvm_emulate_hypercall ( struct kvm_vcpu * vcpu )
{
unsigned long nr , a0 , a1 , a2 , a3 , ret ;
2008-02-22 20:21:37 +03:00
int r = 1 ;
2007-11-01 01:24:24 +03:00
2010-01-17 16:51:22 +03:00
if ( kvm_hv_hypercall_enabled ( vcpu - > kvm ) )
return kvm_hv_hypercall ( vcpu ) ;
2008-06-27 21:58:02 +04:00
nr = kvm_register_read ( vcpu , VCPU_REGS_RAX ) ;
a0 = kvm_register_read ( vcpu , VCPU_REGS_RBX ) ;
a1 = kvm_register_read ( vcpu , VCPU_REGS_RCX ) ;
a2 = kvm_register_read ( vcpu , VCPU_REGS_RDX ) ;
a3 = kvm_register_read ( vcpu , VCPU_REGS_RSI ) ;
2007-11-01 01:24:24 +03:00
2009-06-17 16:22:14 +04:00
trace_kvm_hypercall ( nr , a0 , a1 , a2 , a3 ) ;
2008-04-10 23:31:10 +04:00
2007-11-01 01:24:24 +03:00
if ( ! is_long_mode ( vcpu ) ) {
nr & = 0xFFFFFFFF ;
a0 & = 0xFFFFFFFF ;
a1 & = 0xFFFFFFFF ;
a2 & = 0xFFFFFFFF ;
a3 & = 0xFFFFFFFF ;
}
2009-08-03 20:43:28 +04:00
if ( kvm_x86_ops - > get_cpl ( vcpu ) ! = 0 ) {
ret = - KVM_EPERM ;
goto out ;
}
2007-11-01 01:24:24 +03:00
switch ( nr ) {
2007-10-25 18:52:32 +04:00
case KVM_HC_VAPIC_POLL_IRQ :
ret = 0 ;
break ;
2007-11-01 01:24:24 +03:00
default :
ret = - KVM_ENOSYS ;
break ;
}
2009-08-03 20:43:28 +04:00
out :
2008-06-27 21:58:02 +04:00
kvm_register_write ( vcpu , VCPU_REGS_RAX , ret ) ;
2008-02-20 22:30:30 +03:00
+ + vcpu - > stat . hypercalls ;
2008-02-22 20:21:37 +03:00
return r ;
2007-11-01 01:24:24 +03:00
}
EXPORT_SYMBOL_GPL ( kvm_emulate_hypercall ) ;
2011-04-20 16:47:13 +04:00
int emulator_fix_hypercall ( struct x86_emulate_ctxt * ctxt )
2007-11-01 01:24:24 +03:00
{
2011-04-20 16:47:13 +04:00
struct kvm_vcpu * vcpu = emul_to_vcpu ( ctxt ) ;
2007-11-01 01:24:24 +03:00
char instruction [ 3 ] ;
2008-06-27 21:58:02 +04:00
unsigned long rip = kvm_rip_read ( vcpu ) ;
2007-11-01 01:24:24 +03:00
/*
* Blow out the MMU so that no other VCPU has an active mapping,
* ensuring that the updated hypercall appears atomically across all
* VCPUs.
*/
kvm_mmu_zap_all ( vcpu - > kvm ) ;
kvm_x86_ops - > patch_hypercall ( vcpu , instruction ) ;
2011-05-29 16:53:48 +04:00
return emulator_write_emulated ( ctxt , rip , instruction , 3 , NULL ) ;
2007-11-01 01:24:24 +03:00
}
2007-11-01 22:16:10 +03:00
/*
* Check if userspace requested an interrupt window , and that the
* interrupt window is open .
*
* No need to exit to userspace if we already have an interrupt queued .
*/
2009-08-24 12:10:17 +04:00
static int dm_request_for_irq_injection ( struct kvm_vcpu * vcpu )
2007-11-01 22:16:10 +03:00
{
2009-04-21 18:44:56 +04:00
return ( ! irqchip_in_kernel ( vcpu - > kvm ) & & ! kvm_cpu_has_interrupt ( vcpu ) & &
2009-08-24 12:10:17 +04:00
vcpu - > run - > request_interrupt_window & &
2009-04-21 18:44:59 +04:00
kvm_arch_interrupt_allowed ( vcpu ) ) ;
2007-11-01 22:16:10 +03:00
}
2009-08-24 12:10:17 +04:00
static void post_kvm_run_save ( struct kvm_vcpu * vcpu )
2007-11-01 22:16:10 +03:00
{
2009-08-24 12:10:17 +04:00
struct kvm_run * kvm_run = vcpu - > run ;
2009-10-05 15:07:21 +04:00
kvm_run - > if_flag = ( kvm_get_rflags ( vcpu ) & X86_EFLAGS_IF ) ! = 0 ;
2008-02-24 12:20:43 +03:00
kvm_run - > cr8 = kvm_get_cr8 ( vcpu ) ;
2007-11-01 22:16:10 +03:00
kvm_run - > apic_base = kvm_get_apic_base ( vcpu ) ;
2008-12-11 18:54:54 +03:00
if ( irqchip_in_kernel ( vcpu - > kvm ) )
2007-11-01 22:16:10 +03:00
kvm_run - > ready_for_interrupt_injection = 1 ;
2008-12-11 18:54:54 +03:00
else
2007-11-01 22:16:10 +03:00
kvm_run - > ready_for_interrupt_injection =
2009-05-11 14:35:47 +04:00
kvm_arch_interrupt_allowed ( vcpu ) & &
! kvm_cpu_has_interrupt ( vcpu ) & &
! kvm_event_needs_reinjection ( vcpu ) ;
2007-11-01 22:16:10 +03:00
}
KVM: fix error paths for failed gfn_to_page() calls
This bug was triggered:
[ 4220.198458] BUG: unable to handle kernel paging request at fffffffffffffffe
[ 4220.203907] IP: [<ffffffff81104d85>] put_page+0xf/0x34
......
[ 4220.237326] Call Trace:
[ 4220.237361] [<ffffffffa03830d0>] kvm_arch_destroy_vm+0xf9/0x101 [kvm]
[ 4220.237382] [<ffffffffa036fe53>] kvm_put_kvm+0xcc/0x127 [kvm]
[ 4220.237401] [<ffffffffa03702bc>] kvm_vcpu_release+0x18/0x1c [kvm]
[ 4220.237407] [<ffffffff81145425>] __fput+0x111/0x1ed
[ 4220.237411] [<ffffffff8114550f>] ____fput+0xe/0x10
[ 4220.237418] [<ffffffff81063511>] task_work_run+0x5d/0x88
[ 4220.237424] [<ffffffff8104c3f7>] do_exit+0x2bf/0x7ca
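The test case:
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <pthread.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>
#define die(fmt, args...) do { \
printf(fmt, ##args); \
exit(-1);} while (0)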
static int create_vm(void)
{
int sys_fd, vm_fd;
sys_fd = open("/dev/kvm", O_RDWR);
if (sys_fd < 0)
die("open /dev/kvm fail.\n");
vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0);
if (vm_fd < 0)
die("KVM_CREATE_VM fail.\n");
return vm_fd;
}
static int create_vcpu(int vm_fd)
{
int vcpu_fd;
vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
if (vcpu_fd < 0)
die("KVM_CREATE_VCPU ioctl.\n");
printf("Create vcpu.\n");
return vcpu_fd;
}
static void *vcpu_thread(void *arg)
{
int vm_fd = (int)(long)arg;
create_vcpu(vm_fd);
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t thread;
int vm_fd;
(void)argc;
(void)argv;
vm_fd = create_vm();
pthread_create(&thread, NULL, vcpu_thread, (void *)(long)vm_fd);
printf("Exit.\n");
return 0;
}
It is caused by releasing kvm->arch.ept_identity_map_addr, which is the
error page.
The parent thread can send a KILL signal to the vcpu thread while it is
exiting, which stops it from faulting in pages and potentially from
allocating memory, so gfn_to_pfn/gfn_to_page may fail at this time.
Fix this by checking the page before it is used.
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-09-07 10:14:20 +04:00
static int vapic_enter ( struct kvm_vcpu * vcpu )
2007-10-25 18:52:32 +04:00
{
struct kvm_lapic * apic = vcpu - > arch . apic ;
struct page * page ;
if ( ! apic | | ! apic - > vapic_addr )
2012-09-07 10:14:20 +04:00
return 0 ;
2007-10-25 18:52:32 +04:00
page = gfn_to_page ( vcpu - > kvm , apic - > vapic_addr > > PAGE_SHIFT ) ;
2012-09-07 10:14:20 +04:00
if ( is_error_page ( page ) )
return - EFAULT ;
2008-02-10 19:04:15 +03:00
vcpu - > arch . apic - > vapic_page = page ;
2012-09-07 10:14:20 +04:00
return 0 ;
2007-10-25 18:52:32 +04:00
}
static void vapic_exit ( struct kvm_vcpu * vcpu )
{
struct kvm_lapic * apic = vcpu - > arch . apic ;
2009-12-23 19:35:25 +03:00
int idx ;
2007-10-25 18:52:32 +04:00
if ( ! apic | | ! apic - > vapic_addr )
return ;
2009-12-23 19:35:25 +03:00
idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
2007-10-25 18:52:32 +04:00
kvm_release_page_dirty ( apic - > vapic_page ) ;
mark_page_dirty ( vcpu - > kvm , apic - > vapic_addr > > PAGE_SHIFT ) ;
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & vcpu - > kvm - > srcu , idx ) ;
2007-10-25 18:52:32 +04:00
}
2009-04-21 18:45:08 +04:00
static void update_cr8_intercept ( struct kvm_vcpu * vcpu )
{
int max_irr , tpr ;
if ( ! kvm_x86_ops - > update_cr8_intercept )
return ;
2009-08-17 23:49:40 +04:00
if ( ! vcpu - > arch . apic )
return ;
2009-05-11 14:35:54 +04:00
if ( ! vcpu - > arch . apic - > vapic_addr )
max_irr = kvm_lapic_find_highest_irr ( vcpu ) ;
else
max_irr = - 1 ;
2009-04-21 18:45:08 +04:00
if ( max_irr ! = - 1 )
max_irr > > = 4 ;
tpr = kvm_lapic_get_cr8 ( vcpu ) ;
kvm_x86_ops - > update_cr8_intercept ( vcpu , tpr , max_irr ) ;
}
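The shift by four in update_cr8_intercept ( ) turns the highest pending vector into its priority class (the upper four bits of the vector), which is the unit that CR8 compares against. A minimal sketch of that mapping follows; the helper names irq_priority_class and irq_masked_by_tpr are hypothetical and do not appear in this file, and the masking rule shown is the architectural one (an interrupt is held back when its class is not above the task priority), not the vendor hook's actual logic.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: map the highest pending vector (or -1 for "none")
 * to its priority class, mirroring the "max_irr >>= 4" step above. */
int irq_priority_class(int max_irr)
{
	assert(max_irr >= -1 && max_irr <= 255);
	if (max_irr == -1)
		return -1;
	return max_irr >> 4;
}

/* Hypothetical helper: true when an interrupt is pending but the task
 * priority held in CR8 is high enough to keep it from being delivered. */
bool irq_masked_by_tpr(int tpr, int max_irr)
{
	int prio = irq_priority_class(max_irr);

	return prio != -1 && prio <= tpr;
}
```

For example, vector 0x51 belongs to class 5, so it stays masked while CR8 is 5 or higher and becomes deliverable once the guest lowers CR8 to 4.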
2009-08-24 12:10:17 +04:00
static void inject_pending_event ( struct kvm_vcpu * vcpu )
2009-04-21 18:45:08 +04:00
{
/* try to reinject previous events if any */
2009-07-09 16:33:51 +04:00
if ( vcpu - > arch . exception . pending ) {
2010-03-11 14:01:59 +03:00
trace_kvm_inj_exception ( vcpu - > arch . exception . nr ,
vcpu - > arch . exception . has_error_code ,
vcpu - > arch . exception . error_code ) ;
2009-07-09 16:33:51 +04:00
kvm_x86_ops - > queue_exception ( vcpu , vcpu - > arch . exception . nr ,
vcpu - > arch . exception . has_error_code ,
2010-04-22 14:33:13 +04:00
vcpu - > arch . exception . error_code ,
vcpu - > arch . exception . reinject ) ;
2009-07-09 16:33:51 +04:00
return ;
}
2009-04-21 18:45:08 +04:00
if ( vcpu - > arch . nmi_injected ) {
kvm_x86_ops - > set_nmi ( vcpu ) ;
return ;
}
if ( vcpu - > arch . interrupt . pending ) {
2009-05-11 14:35:50 +04:00
kvm_x86_ops - > set_irq ( vcpu ) ;
2009-04-21 18:45:08 +04:00
return ;
}
/* try to inject new event if pending */
if ( vcpu - > arch . nmi_pending ) {
if ( kvm_x86_ops - > nmi_allowed ( vcpu ) ) {
2011-09-20 14:43:14 +04:00
- - vcpu - > arch . nmi_pending ;
2009-04-21 18:45:08 +04:00
vcpu - > arch . nmi_injected = true ;
kvm_x86_ops - > set_nmi ( vcpu ) ;
}
} else if ( kvm_cpu_has_interrupt ( vcpu ) ) {
if ( kvm_x86_ops - > interrupt_allowed ( vcpu ) ) {
2009-05-11 14:35:50 +04:00
kvm_queue_interrupt ( vcpu , kvm_cpu_get_interrupt ( vcpu ) ,
false ) ;
kvm_x86_ops - > set_irq ( vcpu ) ;
2009-04-21 18:45:08 +04:00
}
}
}
2010-06-10 07:27:12 +04:00
static void kvm_load_guest_xcr0 ( struct kvm_vcpu * vcpu )
{
if ( kvm_read_cr4_bits ( vcpu , X86_CR4_OSXSAVE ) & &
! vcpu - > guest_xcr0_loaded ) {
/* kvm_set_xcr() also depends on this */
xsetbv ( XCR_XFEATURE_ENABLED_MASK , vcpu - > arch . xcr0 ) ;
vcpu - > guest_xcr0_loaded = 1 ;
}
}
static void kvm_put_guest_xcr0 ( struct kvm_vcpu * vcpu )
{
if ( vcpu - > guest_xcr0_loaded ) {
if ( vcpu - > arch . xcr0 ! = host_xcr0 )
xsetbv ( XCR_XFEATURE_ENABLED_MASK , host_xcr0 ) ;
vcpu - > guest_xcr0_loaded = 0 ;
}
}
2011-09-20 14:43:14 +04:00
static void process_nmi ( struct kvm_vcpu * vcpu )
{
unsigned limit = 2 ;
/*
* x86 is limited to one NMI running , and one NMI pending after it .
* If an NMI is already in progress , limit further NMIs to just one .
* Otherwise , allow two ( and we ' ll inject the first one immediately ) .
*/
if ( kvm_x86_ops - > get_nmi_mask ( vcpu ) | | vcpu - > arch . nmi_injected )
limit = 1 ;
vcpu - > arch . nmi_pending + = atomic_xchg ( & vcpu - > arch . nmi_queued , 0 ) ;
vcpu - > arch . nmi_pending = min ( vcpu - > arch . nmi_pending , limit ) ;
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
}
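The clamping in process_nmi ( ) can be modelled outside the kernel. The sketch below is an assumption-laden userspace analogue: struct nmi_state and collect_nmis are hypothetical names, and C11 atomic_exchange stands in for the kernel's atomic_xchg. It drains the queued counter into the pending count and caps the result at two, or at one when an NMI is already masked or injected.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the NMI fields of the vcpu. */
struct nmi_state {
	atomic_uint queued;   /* incremented by other threads */
	unsigned pending;     /* owned by the vcpu thread */
	bool injected;        /* an NMI is currently being injected */
	bool masked;          /* an NMI handler is running in the guest */
};

/* Drain queued NMIs into the pending count, then clamp to the limit. */
unsigned collect_nmis(struct nmi_state *s)
{
	unsigned limit = 2;

	assert(s);
	if (s->masked || s->injected)
		limit = 1;
	s->pending += atomic_exchange(&s->queued, 0);
	if (s->pending > limit)
		s->pending = limit;
	return s->pending;
}
```

With five NMIs queued and none in flight, the pending count ends at two; once one is marked injected, any further burst collapses to a single pending NMI.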
2009-08-24 12:10:17 +04:00
static int vcpu_enter_guest ( struct kvm_vcpu * vcpu )
2007-11-01 22:16:10 +03:00
{
int r ;
2009-05-11 14:35:51 +04:00
bool req_int_win = ! irqchip_in_kernel ( vcpu - > kvm ) & &
2009-08-24 12:10:17 +04:00
vcpu - > run - > request_interrupt_window ;
KVM: nVMX: Add KVM_REQ_IMMEDIATE_EXIT
This patch adds a new vcpu->requests bit, KVM_REQ_IMMEDIATE_EXIT.
This bit requests that when next entering the guest, we should run it for
as short a time as possible, and exit again.
We use this new option in nested VMX: When L1 launches L2, but L0 wishes L1
to continue running so it can inject an event to it, we unfortunately cannot
just pretend to have run L2 for a little while - We must really launch L2,
otherwise certain one-off vmcs12 parameters (namely, L1 injection into L2)
will be lost. So the existing code runs L2 in this case.
But L2 could potentially run for a long time until it exits, and the
injection into L1 will be delayed. The new KVM_REQ_IMMEDIATE_EXIT allows us
to request that L2 will be entered, as necessary, but will exit as soon as
possible after entry.
Our implementation of this request uses smp_send_reschedule() to send a
self-IPI, with interrupts disabled. The interrupts remain disabled until the
guest is entered, and then, after the entry is complete (often including
processing an injection and jumping to the relevant handler), the physical
interrupt is noticed and causes an exit.
On recent Intel processors, we could have achieved the same goal by using
MTF instead of a self-IPI. Another technique worth considering in the future
is to use VM_EXIT_ACK_INTR_ON_EXIT and a highest-priority vector IPI - to
slightly improve performance by avoiding the useless interrupt handler
which ends up being called when smp_send_reschedule() is used.
Signed-off-by: Nadav Har'El <nyh@il.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2011-09-22 14:52:56 +04:00
bool req_immediate_exit = 0 ;
2007-11-01 22:16:10 +03:00
2010-06-23 15:26:18 +04:00
if ( vcpu - > requests ) {
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_MMU_RELOAD , vcpu ) )
2008-02-20 22:47:24 +03:00
kvm_mmu_unload ( vcpu ) ;
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_MIGRATE_TIMER , vcpu ) )
2008-05-27 19:10:20 +04:00
__kvm_migrate_timers ( vcpu ) ;
2010-09-19 04:38:14 +04:00
if ( kvm_check_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ) {
r = kvm_guest_time_update ( vcpu ) ;
2010-08-20 12:07:21 +04:00
if ( unlikely ( r ) )
goto out ;
}
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_MMU_SYNC , vcpu ) )
2008-09-23 20:18:39 +04:00
kvm_mmu_sync_roots ( vcpu ) ;
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_TLB_FLUSH , vcpu ) )
2008-06-06 23:37:35 +04:00
kvm_x86_ops - > tlb_flush ( vcpu ) ;
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_REPORT_TPR_ACCESS , vcpu ) ) {
2009-08-24 12:10:17 +04:00
vcpu - > run - > exit_reason = KVM_EXIT_TPR_ACCESS ;
2007-10-25 18:52:32 +04:00
r = 0 ;
goto out ;
}
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_TRIPLE_FAULT , vcpu ) ) {
2009-08-24 12:10:17 +04:00
vcpu - > run - > exit_reason = KVM_EXIT_SHUTDOWN ;
2008-02-26 18:49:16 +03:00
r = 0 ;
goto out ;
}
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_DEACTIVATE_FPU , vcpu ) ) {
2009-12-30 13:40:26 +03:00
vcpu - > fpu_active = 0 ;
kvm_x86_ops - > fpu_deactivate ( vcpu ) ;
}
2010-10-14 13:22:46 +04:00
if ( kvm_check_request ( KVM_REQ_APF_HALT , vcpu ) ) {
/* Page is swapped out. Do synthetic halt */
vcpu - > arch . apf . halted = true ;
r = 1 ;
goto out ;
}
2011-07-11 23:28:14 +04:00
if ( kvm_check_request ( KVM_REQ_STEAL_UPDATE , vcpu ) )
record_steal_time ( vcpu ) ;
2011-09-20 14:43:14 +04:00
if ( kvm_check_request ( KVM_REQ_NMI , vcpu ) )
process_nmi ( vcpu ) ;
2011-09-22 14:52:56 +04:00
req_immediate_exit =
kvm_check_request ( KVM_REQ_IMMEDIATE_EXIT , vcpu ) ;
2011-11-10 16:57:22 +04:00
if ( kvm_check_request ( KVM_REQ_PMU , vcpu ) )
kvm_handle_pmu_event ( vcpu ) ;
if ( kvm_check_request ( KVM_REQ_PMI , vcpu ) )
kvm_deliver_pmi ( vcpu ) ;
2008-01-16 13:49:30 +03:00
}
2007-10-25 18:52:32 +04:00
2010-07-20 16:06:17 +04:00
if ( kvm_check_request ( KVM_REQ_EVENT , vcpu ) | | req_int_win ) {
inject_pending_event ( vcpu ) ;
/* enable NMI/IRQ window open exits if needed */
2011-09-20 14:43:14 +04:00
if ( vcpu - > arch . nmi_pending )
2010-07-20 16:06:17 +04:00
kvm_x86_ops - > enable_nmi_window ( vcpu ) ;
else if ( kvm_cpu_has_interrupt ( vcpu ) | | req_int_win )
kvm_x86_ops - > enable_irq_window ( vcpu ) ;
if ( kvm_lapic_enabled ( vcpu ) ) {
update_cr8_intercept ( vcpu ) ;
kvm_lapic_sync_to_vapic ( vcpu ) ;
}
}
2012-05-14 19:07:56 +04:00
r = kvm_mmu_reload ( vcpu ) ;
if ( unlikely ( r ) ) {
2012-06-24 20:25:00 +04:00
goto cancel_injection ;
2012-05-14 19:07:56 +04:00
}
2007-11-01 22:16:10 +03:00
preempt_disable ( ) ;
kvm_x86_ops - > prepare_guest_switch ( vcpu ) ;
2010-01-21 16:31:45 +03:00
if ( vcpu - > fpu_active )
kvm_load_guest_fpu ( vcpu ) ;
2010-06-10 07:27:12 +04:00
kvm_load_guest_xcr0 ( vcpu ) ;
2007-11-01 22:16:10 +03:00
2011-01-12 10:40:31 +03:00
vcpu - > mode = IN_GUEST_MODE ;
/* We should set ->mode before checking ->requests,
* see the comment in make_all_cpus_request .
*/
smp_mb ( ) ;
2007-11-01 22:16:10 +03:00
2010-05-03 17:54:48 +04:00
local_irq_disable ( ) ;
2009-05-08 00:55:12 +04:00
2011-01-12 10:40:31 +03:00
if ( vcpu - > mode = = EXITING_GUEST_MODE | | vcpu - > requests
2010-05-03 17:54:48 +04:00
| | need_resched ( ) | | signal_pending ( current ) ) {
2011-01-12 10:40:31 +03:00
vcpu - > mode = OUTSIDE_GUEST_MODE ;
2010-05-03 17:54:48 +04:00
smp_wmb ( ) ;
2008-01-15 19:27:32 +03:00
local_irq_enable ( ) ;
preempt_enable ( ) ;
r = 1 ;
2012-06-24 20:25:00 +04:00
goto cancel_injection ;
2008-01-15 19:27:32 +03:00
}
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & vcpu - > kvm - > srcu , vcpu - > srcu_idx ) ;
2008-03-30 03:17:59 +04:00
2011-09-22 14:52:56 +04:00
if ( req_immediate_exit )
smp_send_reschedule ( vcpu - > cpu ) ;
2007-11-01 22:16:10 +03:00
kvm_guest_enter ( ) ;
2008-12-15 15:52:10 +03:00
if ( unlikely ( vcpu - > arch . switch_db_regs ) ) {
set_debugreg ( 0 , 7 ) ;
set_debugreg ( vcpu - > arch . eff_db [ 0 ] , 0 ) ;
set_debugreg ( vcpu - > arch . eff_db [ 1 ] , 1 ) ;
set_debugreg ( vcpu - > arch . eff_db [ 2 ] , 2 ) ;
set_debugreg ( vcpu - > arch . eff_db [ 3 ] , 3 ) ;
}
2007-11-01 22:16:10 +03:00
2009-06-17 16:22:14 +04:00
trace_kvm_entry ( vcpu - > vcpu_id ) ;
2009-08-24 12:10:17 +04:00
kvm_x86_ops - > run ( vcpu ) ;
2007-11-01 22:16:10 +03:00
2009-09-09 21:22:48 +04:00
/*
* If the guest has used debug registers , at least dr7
* will be disabled while returning to the host .
* If we don ' t have active breakpoints in the host , we don ' t
* care about the messed up debug address registers . But if
* we have some of them active , restore the old state .
*/
2009-11-10 13:03:12 +03:00
if ( hw_breakpoint_active ( ) )
2009-09-09 21:22:48 +04:00
hw_breakpoint_restore ( ) ;
2008-12-15 15:52:10 +03:00
2011-08-02 16:54:20 +04:00
vcpu - > arch . last_guest_tsc = kvm_x86_ops - > read_l1_tsc ( vcpu ) ;
2010-08-20 12:07:30 +04:00
2011-01-12 10:40:31 +03:00
vcpu - > mode = OUTSIDE_GUEST_MODE ;
2010-05-03 17:54:48 +04:00
smp_wmb ( ) ;
2007-11-01 22:16:10 +03:00
local_irq_enable ( ) ;
+ + vcpu - > stat . exits ;
/*
* We must have an instruction between local_irq_enable ( ) and
* kvm_guest_exit ( ) , so the timer interrupt isn ' t delayed by
* the interrupt shadow . The stat . exits increment will do nicely .
* But we need to prevent reordering , hence this barrier ( ) :
*/
barrier ( ) ;
kvm_guest_exit ( ) ;
preempt_enable ( ) ;
2009-12-23 19:35:25 +03:00
vcpu - > srcu_idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
2008-03-30 03:17:59 +04:00
2007-11-01 22:16:10 +03:00
/*
* Profile KVM exit RIPs :
*/
if ( unlikely ( prof_on = = KVM_PROFILING ) ) {
2008-06-27 21:58:02 +04:00
unsigned long rip = kvm_rip_read ( vcpu ) ;
profile_hit ( KVM_PROFILING , ( void * ) rip ) ;
2007-11-01 22:16:10 +03:00
}
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
if ( unlikely ( vcpu - > arch . tsc_always_catchup ) )
kvm_make_request ( KVM_REQ_CLOCK_UPDATE , vcpu ) ;
2007-11-25 14:41:11 +03:00
2012-06-24 20:24:54 +04:00
if ( vcpu - > arch . apic_attention )
kvm_lapic_sync_from_vapic ( vcpu ) ;
2007-10-25 18:52:32 +04:00
2009-08-24 12:10:17 +04:00
r = kvm_x86_ops - > handle_exit ( vcpu ) ;
2012-06-24 20:25:00 +04:00
return r ;
cancel_injection :
kvm_x86_ops - > cancel_injection ( vcpu ) ;
2012-06-24 20:25:07 +04:00
if ( unlikely ( vcpu - > arch . apic_attention ) )
kvm_lapic_sync_from_vapic ( vcpu ) ;
2008-09-08 22:23:48 +04:00
out :
return r ;
}
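vcpu_enter_guest ( ) consumes each request bit with a check that also clears it, so a request is handled once per time it is raised. The following is a simplified userspace sketch of that pattern, not the kernel implementation: make_request, check_request and the REQ_* values are hypothetical, and a single C11 atomic fetch-and replaces whatever bit operations the kernel helpers use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request numbers, for illustration only. */
enum { REQ_TLB_FLUSH = 0, REQ_EVENT = 1, REQ_NR = 2 };

/* Raise a request: set its bit in the shared request word. */
void make_request(atomic_ulong *requests, unsigned req)
{
	assert(req < REQ_NR);
	atomic_fetch_or(requests, 1UL << req);
}

/* Consume a request: clear its bit and report whether it was set. */
bool check_request(atomic_ulong *requests, unsigned req)
{
	unsigned long bit;

	assert(req < REQ_NR);
	bit = 1UL << req;
	return (atomic_fetch_and(requests, ~bit) & bit) != 0;
}
```

Because the check clears the bit, a second check for the same request returns false until some other thread raises it again, while unrelated bits are left untouched.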
2007-11-01 22:16:10 +03:00
2009-03-23 16:11:44 +03:00
2009-08-24 12:10:17 +04:00
static int __vcpu_run ( struct kvm_vcpu * vcpu )
2008-09-08 22:23:48 +04:00
{
int r ;
2009-12-23 19:35:25 +03:00
struct kvm * kvm = vcpu - > kvm ;
2008-09-08 22:23:48 +04:00
if ( unlikely ( vcpu - > arch . mp_state = = KVM_MP_STATE_SIPI_RECEIVED ) ) {
2008-09-30 12:41:06 +04:00
pr_debug ( " vcpu %d received sipi with vector # %x \n " ,
vcpu - > vcpu_id , vcpu - > arch . sipi_vector ) ;
2008-09-08 22:23:48 +04:00
kvm_lapic_reset ( vcpu ) ;
2008-10-07 17:42:33 +04:00
r = kvm_arch_vcpu_reset ( vcpu ) ;
2008-09-08 22:23:48 +04:00
if ( r )
return r ;
vcpu - > arch . mp_state = KVM_MP_STATE_RUNNABLE ;
2007-11-01 22:16:10 +03:00
}
2009-12-23 19:35:25 +03:00
vcpu - > srcu_idx = srcu_read_lock ( & kvm - > srcu ) ;
2012-09-07 10:14:20 +04:00
r = vapic_enter ( vcpu ) ;
if ( r ) {
srcu_read_unlock ( & kvm - > srcu , vcpu - > srcu_idx ) ;
return r ;
}
2008-09-08 22:23:48 +04:00
r = 1 ;
while ( r > 0 ) {
2010-10-14 13:22:46 +04:00
if ( vcpu - > arch . mp_state = = KVM_MP_STATE_RUNNABLE & &
! vcpu - > arch . apf . halted )
2009-08-24 12:10:17 +04:00
r = vcpu_enter_guest ( vcpu ) ;
2008-09-08 22:23:48 +04:00
else {
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & kvm - > srcu , vcpu - > srcu_idx ) ;
2008-09-08 22:23:48 +04:00
kvm_vcpu_block ( vcpu ) ;
2009-12-23 19:35:25 +03:00
vcpu - > srcu_idx = srcu_read_lock ( & kvm - > srcu ) ;
2010-05-10 13:34:53 +04:00
if ( kvm_check_request ( KVM_REQ_UNHALT , vcpu ) )
2009-03-23 16:11:44 +03:00
{
switch ( vcpu - > arch . mp_state ) {
case KVM_MP_STATE_HALTED :
2008-09-08 22:23:48 +04:00
vcpu - > arch . mp_state =
2009-03-23 16:11:44 +03:00
KVM_MP_STATE_RUNNABLE ;
case KVM_MP_STATE_RUNNABLE :
2010-10-14 13:22:46 +04:00
vcpu - > arch . apf . halted = false ;
2009-03-23 16:11:44 +03:00
break ;
case KVM_MP_STATE_SIPI_RECEIVED :
default :
r = - EINTR ;
break ;
}
}
2008-09-08 22:23:48 +04:00
}
2009-03-23 16:11:44 +03:00
if ( r < = 0 )
break ;
clear_bit ( KVM_REQ_PENDING_TIMER , & vcpu - > requests ) ;
if ( kvm_cpu_has_pending_timer ( vcpu ) )
kvm_inject_pending_timer_irqs ( vcpu ) ;
2009-08-24 12:10:17 +04:00
if ( dm_request_for_irq_injection ( vcpu ) ) {
2009-03-23 16:11:44 +03:00
r = - EINTR ;
2009-08-24 12:10:17 +04:00
vcpu - > run - > exit_reason = KVM_EXIT_INTR ;
2009-03-23 16:11:44 +03:00
+ + vcpu - > stat . request_irq_exits ;
}
2010-10-14 13:22:46 +04:00
kvm_check_async_pf_completion ( vcpu ) ;
2009-03-23 16:11:44 +03:00
if ( signal_pending ( current ) ) {
r = - EINTR ;
2009-08-24 12:10:17 +04:00
vcpu - > run - > exit_reason = KVM_EXIT_INTR ;
2009-03-23 16:11:44 +03:00
+ + vcpu - > stat . signal_exits ;
}
if ( need_resched ( ) ) {
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & kvm - > srcu , vcpu - > srcu_idx ) ;
2009-03-23 16:11:44 +03:00
kvm_resched ( vcpu ) ;
2009-12-23 19:35:25 +03:00
vcpu - > srcu_idx = srcu_read_lock ( & kvm - > srcu ) ;
2008-09-08 22:23:48 +04:00
}
2007-11-01 22:16:10 +03:00
}
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & kvm - > srcu , vcpu - > srcu_idx ) ;
2007-11-01 22:16:10 +03:00
2007-10-25 18:52:32 +04:00
vapic_exit ( vcpu ) ;
2007-11-01 22:16:10 +03:00
return r ;
}
2012-04-18 20:22:47 +04:00
/*
* Implements the following , as a state machine :
*
* read :
* for each fragment
* write gpa , len
* exit
* copy data
* execute insn
*
* write :
* for each fragment
* write gpa , len
* copy data
* exit
*/
2010-01-19 15:20:10 +03:00
static int complete_mmio ( struct kvm_vcpu * vcpu )
{
struct kvm_run * run = vcpu - > run ;
2012-04-18 20:22:47 +04:00
struct kvm_mmio_fragment * frag ;
2010-01-19 15:20:10 +03:00
int r ;
if ( ! ( vcpu - > arch . pio . count | | vcpu - > mmio_needed ) )
return 1 ;
if ( vcpu - > mmio_needed ) {
2012-04-18 20:22:47 +04:00
/* Complete previous fragment */
frag = & vcpu - > mmio_fragments [ vcpu - > mmio_cur_fragment + + ] ;
2010-01-20 13:01:20 +03:00
if ( ! vcpu - > mmio_is_write )
2012-04-18 20:22:47 +04:00
memcpy ( frag - > data , run - > mmio . data , frag - > len ) ;
if ( vcpu - > mmio_cur_fragment = = vcpu - > mmio_nr_fragments ) {
vcpu - > mmio_needed = 0 ;
if ( vcpu - > mmio_is_write )
return 1 ;
vcpu - > mmio_read_completed = 1 ;
goto done ;
2010-01-20 13:01:20 +03:00
}
2012-04-18 20:22:47 +04:00
/* Initiate next fragment */
+ + frag ;
run - > exit_reason = KVM_EXIT_MMIO ;
run - > mmio . phys_addr = frag - > gpa ;
2010-01-20 13:01:20 +03:00
if ( vcpu - > mmio_is_write )
2012-04-18 20:22:47 +04:00
memcpy ( run - > mmio . data , frag - > data , frag - > len ) ;
run - > mmio . len = frag - > len ;
run - > mmio . is_write = vcpu - > mmio_is_write ;
return 0 ;
2010-01-19 15:20:10 +03:00
}
2012-04-18 20:22:47 +04:00
done :
2010-01-19 15:20:10 +03:00
vcpu - > srcu_idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
r = emulate_instruction ( vcpu , EMULTYPE_NO_DECODE ) ;
srcu_read_unlock ( & vcpu - > kvm - > srcu , vcpu - > srcu_idx ) ;
if ( r ! = EMULATE_DONE )
return 0 ;
return 1 ;
}
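The fragment handling in complete_mmio ( ) can be followed more easily in a reduced model. The sketch below keeps only the per-fragment bookkeeping; struct fragment, struct exit_record, struct mmio_state and mmio_step are hypothetical names, the buffer sizes are arbitrary, and the instruction re-emulation that follows the last read fragment is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_FRAGMENTS 2   /* arbitrary size for this sketch */
#define FRAG_BYTES    8

struct fragment {
	unsigned long gpa;
	unsigned char data[FRAG_BYTES];
	unsigned len;
};

/* Stand-in for the part of the shared run structure used for MMIO. */
struct exit_record {
	unsigned long phys_addr;
	unsigned char data[FRAG_BYTES];
	unsigned len;
	bool is_write;
};

struct mmio_state {
	struct fragment frags[MAX_FRAGMENTS];
	int nr, cur;
	bool is_write;
};

/* Complete the current fragment and, if one remains, publish the next.
 * Returns true when another exit to userspace is needed. */
bool mmio_step(struct mmio_state *m, struct exit_record *run)
{
	struct fragment *frag;

	assert(m->cur < m->nr);
	frag = &m->frags[m->cur++];
	assert(frag->len <= FRAG_BYTES);
	if (!m->is_write)
		memcpy(frag->data, run->data, frag->len); /* data userspace supplied */
	if (m->cur == m->nr)
		return false;                             /* all fragments done */
	++frag;
	run->phys_addr = frag->gpa;
	if (m->is_write)
		memcpy(run->data, frag->data, frag->len);
	run->len = frag->len;
	run->is_write = m->is_write;
	return true;
}
```

For a read split across two fragments, the first call stores the bytes userspace returned for fragment 0 and publishes fragment 1; the second call stores the remaining bytes and reports completion.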
2007-11-01 22:16:10 +03:00
int kvm_arch_vcpu_ioctl_run ( struct kvm_vcpu * vcpu , struct kvm_run * kvm_run )
{
int r ;
sigset_t sigsaved ;
2011-01-11 13:15:54 +03:00
if ( ! tsk_used_math ( current ) & & init_fpu ( current ) )
return - ENOMEM ;
2008-07-06 16:48:31 +04:00
if ( vcpu - > sigset_active )
sigprocmask ( SIG_SETMASK , & vcpu - > sigset , & sigsaved ) ;
2008-04-13 18:54:35 +04:00
if ( unlikely ( vcpu - > arch . mp_state = = KVM_MP_STATE_UNINITIALIZED ) ) {
2007-11-01 22:16:10 +03:00
kvm_vcpu_block ( vcpu ) ;
2008-09-08 22:23:48 +04:00
clear_bit ( KVM_REQ_UNHALT , & vcpu - > requests ) ;
2008-07-06 16:48:31 +04:00
r = - EAGAIN ;
goto out ;
2007-11-01 22:16:10 +03:00
}
/* re-sync apic's tpr */
2010-12-21 13:12:00 +03:00
if ( ! irqchip_in_kernel ( vcpu - > kvm ) ) {
if ( kvm_set_cr8 ( vcpu , kvm_run - > cr8 ) ! = 0 ) {
r = - EINVAL ;
goto out ;
}
}
2007-11-01 22:16:10 +03:00
2010-01-19 15:20:10 +03:00
r = complete_mmio ( vcpu ) ;
if ( r < = 0 )
goto out ;
2009-08-24 12:10:17 +04:00
r = __vcpu_run ( vcpu ) ;
2007-11-01 22:16:10 +03:00
out :
2010-05-04 06:04:27 +04:00
post_kvm_run_save ( vcpu ) ;
2007-11-01 22:16:10 +03:00
if ( vcpu - > sigset_active )
sigprocmask ( SIG_SETMASK , & sigsaved , NULL ) ;
return r ;
}
int kvm_arch_vcpu_ioctl_get_regs ( struct kvm_vcpu * vcpu , struct kvm_regs * regs )
{
2011-03-31 14:06:41 +04:00
if ( vcpu - > arch . emulate_regs_need_sync_to_vcpu ) {
/*
 * We are here if userspace calls get_regs() in the middle of
 * instruction emulation. Register state needs to be copied
 * back from the emulation context to the vcpu. Userspace shouldn't
 * usually do that, but some badly designed PV devices (vmware
 * backdoor interface) need this to work.
 */
2011-06-01 16:34:25 +04:00
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
memcpy ( vcpu - > arch . regs , ctxt - > regs , sizeof ctxt - > regs ) ;
2011-03-31 14:06:41 +04:00
vcpu - > arch . emulate_regs_need_sync_to_vcpu = false ;
}
2008-06-27 21:58:02 +04:00
regs - > rax = kvm_register_read ( vcpu , VCPU_REGS_RAX ) ;
regs - > rbx = kvm_register_read ( vcpu , VCPU_REGS_RBX ) ;
regs - > rcx = kvm_register_read ( vcpu , VCPU_REGS_RCX ) ;
regs - > rdx = kvm_register_read ( vcpu , VCPU_REGS_RDX ) ;
regs - > rsi = kvm_register_read ( vcpu , VCPU_REGS_RSI ) ;
regs - > rdi = kvm_register_read ( vcpu , VCPU_REGS_RDI ) ;
regs - > rsp = kvm_register_read ( vcpu , VCPU_REGS_RSP ) ;
regs - > rbp = kvm_register_read ( vcpu , VCPU_REGS_RBP ) ;
2007-11-01 22:16:10 +03:00
# ifdef CONFIG_X86_64
2008-06-27 21:58:02 +04:00
regs - > r8 = kvm_register_read ( vcpu , VCPU_REGS_R8 ) ;
regs - > r9 = kvm_register_read ( vcpu , VCPU_REGS_R9 ) ;
regs - > r10 = kvm_register_read ( vcpu , VCPU_REGS_R10 ) ;
regs - > r11 = kvm_register_read ( vcpu , VCPU_REGS_R11 ) ;
regs - > r12 = kvm_register_read ( vcpu , VCPU_REGS_R12 ) ;
regs - > r13 = kvm_register_read ( vcpu , VCPU_REGS_R13 ) ;
regs - > r14 = kvm_register_read ( vcpu , VCPU_REGS_R14 ) ;
regs - > r15 = kvm_register_read ( vcpu , VCPU_REGS_R15 ) ;
2007-11-01 22:16:10 +03:00
# endif
2008-06-27 21:58:02 +04:00
regs - > rip = kvm_rip_read ( vcpu ) ;
2009-10-05 15:07:21 +04:00
regs - > rflags = kvm_get_rflags ( vcpu ) ;
2007-11-01 22:16:10 +03:00
return 0 ;
}
int kvm_arch_vcpu_ioctl_set_regs ( struct kvm_vcpu * vcpu , struct kvm_regs * regs )
{
2011-03-31 14:06:41 +04:00
vcpu - > arch . emulate_regs_need_sync_from_vcpu = true ;
vcpu - > arch . emulate_regs_need_sync_to_vcpu = false ;
2008-06-27 21:58:02 +04:00
kvm_register_write ( vcpu , VCPU_REGS_RAX , regs - > rax ) ;
kvm_register_write ( vcpu , VCPU_REGS_RBX , regs - > rbx ) ;
kvm_register_write ( vcpu , VCPU_REGS_RCX , regs - > rcx ) ;
kvm_register_write ( vcpu , VCPU_REGS_RDX , regs - > rdx ) ;
kvm_register_write ( vcpu , VCPU_REGS_RSI , regs - > rsi ) ;
kvm_register_write ( vcpu , VCPU_REGS_RDI , regs - > rdi ) ;
kvm_register_write ( vcpu , VCPU_REGS_RSP , regs - > rsp ) ;
kvm_register_write ( vcpu , VCPU_REGS_RBP , regs - > rbp ) ;
2007-11-01 22:16:10 +03:00
# ifdef CONFIG_X86_64
2008-06-27 21:58:02 +04:00
kvm_register_write ( vcpu , VCPU_REGS_R8 , regs - > r8 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R9 , regs - > r9 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R10 , regs - > r10 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R11 , regs - > r11 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R12 , regs - > r12 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R13 , regs - > r13 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R14 , regs - > r14 ) ;
kvm_register_write ( vcpu , VCPU_REGS_R15 , regs - > r15 ) ;
2007-11-01 22:16:10 +03:00
# endif
2008-06-27 21:58:02 +04:00
kvm_rip_write ( vcpu , regs - > rip ) ;
2009-10-05 15:07:21 +04:00
kvm_set_rflags ( vcpu , regs - > rflags ) ;
2007-11-01 22:16:10 +03:00
2008-04-30 19:59:04 +04:00
vcpu - > arch . exception . pending = false ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2007-11-01 22:16:10 +03:00
return 0 ;
}
void kvm_get_cs_db_l_bits ( struct kvm_vcpu * vcpu , int * db , int * l )
{
struct kvm_segment cs ;
2008-05-27 12:18:46 +04:00
kvm_get_segment ( vcpu , & cs , VCPU_SREG_CS ) ;
2007-11-01 22:16:10 +03:00
* db = cs . db ;
* l = cs . l ;
}
EXPORT_SYMBOL_GPL ( kvm_get_cs_db_l_bits ) ;
int kvm_arch_vcpu_ioctl_get_sregs ( struct kvm_vcpu * vcpu ,
struct kvm_sregs * sregs )
{
2010-02-16 11:51:48 +03:00
struct desc_ptr dt ;
2007-11-01 22:16:10 +03:00
2008-05-27 12:18:46 +04:00
kvm_get_segment ( vcpu , & sregs - > cs , VCPU_SREG_CS ) ;
kvm_get_segment ( vcpu , & sregs - > ds , VCPU_SREG_DS ) ;
kvm_get_segment ( vcpu , & sregs - > es , VCPU_SREG_ES ) ;
kvm_get_segment ( vcpu , & sregs - > fs , VCPU_SREG_FS ) ;
kvm_get_segment ( vcpu , & sregs - > gs , VCPU_SREG_GS ) ;
kvm_get_segment ( vcpu , & sregs - > ss , VCPU_SREG_SS ) ;
2007-11-01 22:16:10 +03:00
2008-05-27 12:18:46 +04:00
kvm_get_segment ( vcpu , & sregs - > tr , VCPU_SREG_TR ) ;
kvm_get_segment ( vcpu , & sregs - > ldt , VCPU_SREG_LDTR ) ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > get_idt ( vcpu , & dt ) ;
2010-02-16 11:51:48 +03:00
sregs - > idt . limit = dt . size ;
sregs - > idt . base = dt . address ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > get_gdt ( vcpu , & dt ) ;
2010-02-16 11:51:48 +03:00
sregs - > gdt . limit = dt . size ;
sregs - > gdt . base = dt . address ;
2007-11-01 22:16:10 +03:00
2009-12-29 19:07:30 +03:00
sregs - > cr0 = kvm_read_cr0 ( vcpu ) ;
2007-12-13 18:50:52 +03:00
sregs - > cr2 = vcpu - > arch . cr2 ;
2010-12-05 18:30:00 +03:00
sregs - > cr3 = kvm_read_cr3 ( vcpu ) ;
2009-12-07 13:16:48 +03:00
sregs - > cr4 = kvm_read_cr4 ( vcpu ) ;
2008-02-24 12:20:43 +03:00
sregs - > cr8 = kvm_get_cr8 ( vcpu ) ;
2010-01-21 16:31:50 +03:00
sregs - > efer = vcpu - > arch . efer ;
2007-11-01 22:16:10 +03:00
sregs - > apic_base = kvm_get_apic_base ( vcpu ) ;
2009-05-11 14:35:48 +04:00
memset ( sregs - > interrupt_bitmap , 0 , sizeof sregs - > interrupt_bitmap ) ;
2007-11-01 22:16:10 +03:00
2009-05-11 14:35:53 +04:00
if ( vcpu - > arch . interrupt . pending & & ! vcpu - > arch . interrupt . soft )
2009-04-21 18:45:11 +04:00
set_bit ( vcpu - > arch . interrupt . nr ,
( unsigned long * ) sregs - > interrupt_bitmap ) ;
2009-04-21 18:45:10 +04:00
2007-11-01 22:16:10 +03:00
return 0 ;
}
2008-04-11 20:24:45 +04:00
int kvm_arch_vcpu_ioctl_get_mpstate ( struct kvm_vcpu * vcpu ,
struct kvm_mp_state * mp_state )
{
mp_state - > mp_state = vcpu - > arch . mp_state ;
return 0 ;
}
int kvm_arch_vcpu_ioctl_set_mpstate ( struct kvm_vcpu * vcpu ,
struct kvm_mp_state * mp_state )
{
vcpu - > arch . mp_state = mp_state - > mp_state ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2008-04-11 20:24:45 +04:00
return 0 ;
}
2012-02-08 17:34:38 +04:00
int kvm_task_switch ( struct kvm_vcpu * vcpu , u16 tss_selector , int idt_index ,
int reason , bool has_error_code , u32 error_code )
2007-11-01 22:16:10 +03:00
{
2011-05-29 16:53:48 +04:00
struct x86_emulate_ctxt * ctxt = & vcpu - > arch . emulate_ctxt ;
2010-08-16 01:47:01 +04:00
int ret ;
2010-01-25 13:01:04 +03:00
2010-08-16 01:47:01 +04:00
init_emulate_ctxt ( vcpu ) ;
2010-02-18 13:15:01 +03:00
2012-02-08 17:34:38 +04:00
ret = emulator_task_switch ( ctxt , tss_selector , idt_index , reason ,
2011-05-29 16:53:48 +04:00
has_error_code , error_code ) ;
2010-02-18 13:15:01 +03:00
if ( ret )
2010-04-15 13:29:50 +04:00
return EMULATE_FAIL ;
2008-03-25 00:14:53 +03:00
2011-06-01 16:34:25 +04:00
memcpy ( vcpu - > arch . regs , ctxt - > regs , sizeof ctxt - > regs ) ;
2011-05-29 16:53:48 +04:00
kvm_rip_write ( vcpu , ctxt - > eip ) ;
kvm_set_rflags ( vcpu , ctxt - > eflags ) ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2010-04-15 13:29:50 +04:00
return EMULATE_DONE ;
2008-03-25 00:14:53 +03:00
}
EXPORT_SYMBOL_GPL ( kvm_task_switch ) ;
2007-11-01 22:16:10 +03:00
int kvm_arch_vcpu_ioctl_set_sregs ( struct kvm_vcpu * vcpu ,
struct kvm_sregs * sregs )
{
int mmu_reset_needed = 0 ;
2011-01-12 10:39:18 +03:00
int pending_vec , max_bits , idx ;
2010-02-16 11:51:48 +03:00
struct desc_ptr dt ;
2007-11-01 22:16:10 +03:00
2010-02-16 11:51:48 +03:00
dt . size = sregs - > idt . limit ;
dt . address = sregs - > idt . base ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > set_idt ( vcpu , & dt ) ;
2010-02-16 11:51:48 +03:00
dt . size = sregs - > gdt . limit ;
dt . address = sregs - > gdt . base ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > set_gdt ( vcpu , & dt ) ;
2007-12-13 18:50:52 +03:00
vcpu - > arch . cr2 = sregs - > cr2 ;
2010-12-05 18:30:00 +03:00
mmu_reset_needed | = kvm_read_cr3 ( vcpu ) ! = sregs - > cr3 ;
2009-07-01 22:52:03 +04:00
vcpu - > arch . cr3 = sregs - > cr3 ;
2010-12-05 19:56:11 +03:00
__set_bit ( VCPU_EXREG_CR3 , ( ulong * ) & vcpu - > arch . regs_avail ) ;
2007-11-01 22:16:10 +03:00
2008-02-24 12:20:43 +03:00
kvm_set_cr8 ( vcpu , sregs - > cr8 ) ;
2007-11-01 22:16:10 +03:00
2010-01-21 16:31:50 +03:00
mmu_reset_needed | = vcpu - > arch . efer ! = sregs - > efer ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > set_efer ( vcpu , sregs - > efer ) ;
kvm_set_apic_base ( vcpu , sregs - > apic_base ) ;
2009-12-29 19:07:30 +03:00
mmu_reset_needed | = kvm_read_cr0 ( vcpu ) ! = sregs - > cr0 ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > set_cr0 ( vcpu , sregs - > cr0 ) ;
2008-02-06 14:02:35 +03:00
vcpu - > arch . cr0 = sregs - > cr0 ;
2007-11-01 22:16:10 +03:00
2009-12-07 13:16:48 +03:00
mmu_reset_needed | = kvm_read_cr4 ( vcpu ) ! = sregs - > cr4 ;
2007-11-01 22:16:10 +03:00
kvm_x86_ops - > set_cr4 ( vcpu , sregs - > cr4 ) ;
2010-12-08 05:49:43 +03:00
if ( sregs - > cr4 & X86_CR4_OSXSAVE )
2011-11-23 18:30:32 +04:00
kvm_update_cpuid ( vcpu ) ;
2011-01-12 10:39:18 +03:00
idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
2009-10-26 21:48:33 +03:00
if ( ! is_long_mode ( vcpu ) & & is_pae ( vcpu ) ) {
2010-12-05 18:30:00 +03:00
load_pdptrs ( vcpu , vcpu - > arch . walk_mmu , kvm_read_cr3 ( vcpu ) ) ;
2009-10-26 21:48:33 +03:00
mmu_reset_needed = 1 ;
}
2011-01-12 10:39:18 +03:00
srcu_read_unlock ( & vcpu - > kvm - > srcu , idx ) ;
2007-11-01 22:16:10 +03:00
if ( mmu_reset_needed )
kvm_mmu_reset_context ( vcpu ) ;
2009-05-11 14:35:48 +04:00
max_bits = ( sizeof sregs - > interrupt_bitmap ) < < 3 ;
pending_vec = find_first_bit (
( const unsigned long * ) sregs - > interrupt_bitmap , max_bits ) ;
if ( pending_vec < max_bits ) {
2009-05-11 14:35:50 +04:00
kvm_queue_interrupt ( vcpu , pending_vec , false ) ;
2009-05-11 14:35:48 +04:00
pr_debug ( " Set back pending irq %d \n " , pending_vec ) ;
2007-11-01 22:16:10 +03:00
}
2008-05-27 12:18:46 +04:00
kvm_set_segment ( vcpu , & sregs - > cs , VCPU_SREG_CS ) ;
kvm_set_segment ( vcpu , & sregs - > ds , VCPU_SREG_DS ) ;
kvm_set_segment ( vcpu , & sregs - > es , VCPU_SREG_ES ) ;
kvm_set_segment ( vcpu , & sregs - > fs , VCPU_SREG_FS ) ;
kvm_set_segment ( vcpu , & sregs - > gs , VCPU_SREG_GS ) ;
kvm_set_segment ( vcpu , & sregs - > ss , VCPU_SREG_SS ) ;
2007-11-01 22:16:10 +03:00
2008-05-27 12:18:46 +04:00
kvm_set_segment ( vcpu , & sregs - > tr , VCPU_SREG_TR ) ;
kvm_set_segment ( vcpu , & sregs - > ldt , VCPU_SREG_LDTR ) ;
2007-11-01 22:16:10 +03:00
2009-08-03 15:58:25 +04:00
update_cr8_intercept ( vcpu ) ;
2008-09-10 23:40:55 +04:00
/* Older userspace won't unhalt the vcpu on reset. */
2009-06-09 16:56:26 +04:00
if ( kvm_vcpu_is_bsp ( vcpu ) & & kvm_rip_read ( vcpu ) = = 0xfff0 & &
2008-09-10 23:40:55 +04:00
sregs - > cs . selector = = 0xf000 & & sregs - > cs . base = = 0xffff0000 & &
2010-01-21 16:31:48 +03:00
! is_protmode ( vcpu ) )
2008-09-10 23:40:55 +04:00
vcpu - > arch . mp_state = KVM_MP_STATE_RUNNABLE ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2007-11-01 22:16:10 +03:00
return 0 ;
}
2008-12-15 15:52:10 +03:00
int kvm_arch_vcpu_ioctl_set_guest_debug ( struct kvm_vcpu * vcpu ,
struct kvm_guest_debug * dbg )
2007-11-01 22:16:10 +03:00
{
2009-10-03 02:31:21 +04:00
unsigned long rflags ;
2008-12-15 15:52:10 +03:00
int i , r ;
2007-11-01 22:16:10 +03:00
2009-10-30 14:46:59 +03:00
if ( dbg - > control & ( KVM_GUESTDBG_INJECT_DB | KVM_GUESTDBG_INJECT_BP ) ) {
r = - EBUSY ;
if ( vcpu - > arch . exception . pending )
2010-05-13 12:25:04 +04:00
goto out ;
2009-10-30 14:46:59 +03:00
if ( dbg - > control & KVM_GUESTDBG_INJECT_DB )
kvm_queue_exception ( vcpu , DB_VECTOR ) ;
else
kvm_queue_exception ( vcpu , BP_VECTOR ) ;
}
2009-10-05 15:07:21 +04:00
/*
 * Read rflags as long as potentially injected trace flags are still
 * filtered out.
 */
rflags = kvm_get_rflags ( vcpu ) ;
2009-10-03 02:31:21 +04:00
vcpu - > guest_debug = dbg - > control ;
if ( ! ( vcpu - > guest_debug & KVM_GUESTDBG_ENABLE ) )
vcpu - > guest_debug = 0 ;
if ( vcpu - > guest_debug & KVM_GUESTDBG_USE_HW_BP ) {
2008-12-15 15:52:10 +03:00
for ( i = 0 ; i < KVM_NR_DB_REGS ; + + i )
vcpu - > arch . eff_db [ i ] = dbg - > arch . debugreg [ i ] ;
vcpu - > arch . switch_db_regs =
( dbg - > arch . debugreg [ 7 ] & DR7_BP_EN_MASK ) ;
} else {
for ( i = 0 ; i < KVM_NR_DB_REGS ; i + + )
vcpu - > arch . eff_db [ i ] = vcpu - > arch . db [ i ] ;
vcpu - > arch . switch_db_regs = ( vcpu - > arch . dr7 & DR7_BP_EN_MASK ) ;
}
2010-02-23 19:47:55 +03:00
if ( vcpu - > guest_debug & KVM_GUESTDBG_SINGLESTEP )
vcpu - > arch . singlestep_rip = kvm_rip_read ( vcpu ) +
get_segment_base ( vcpu , VCPU_SREG_CS ) ;
2009-10-18 15:24:44 +04:00
2009-10-05 15:07:21 +04:00
/*
 * Trigger an rflags update that will inject or remove the trace
 * flags.
 */
kvm_set_rflags ( vcpu , rflags ) ;
2007-11-01 22:16:10 +03:00
2009-10-03 02:31:21 +04:00
kvm_x86_ops - > set_guest_debug ( vcpu , dbg ) ;
2007-11-01 22:16:10 +03:00
2009-10-30 14:46:59 +03:00
r = 0 ;
2008-12-15 15:52:10 +03:00
2010-05-13 12:25:04 +04:00
out :
2007-11-01 22:16:10 +03:00
return r ;
}
2007-11-16 08:05:55 +03:00
/*
 * Translate a guest virtual address to a guest physical address.
 */
int kvm_arch_vcpu_ioctl_translate ( struct kvm_vcpu * vcpu ,
struct kvm_translation * tr )
{
unsigned long vaddr = tr - > linear_address ;
gpa_t gpa ;
2009-12-23 19:35:25 +03:00
int idx ;
2007-11-16 08:05:55 +03:00
2009-12-23 19:35:25 +03:00
idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
2010-02-10 15:21:32 +03:00
gpa = kvm_mmu_gva_to_gpa_system ( vcpu , vaddr , NULL ) ;
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & vcpu - > kvm - > srcu , idx ) ;
2007-11-16 08:05:55 +03:00
tr - > physical_address = gpa ;
tr - > valid = gpa ! = UNMAPPED_GVA ;
tr - > writeable = 1 ;
tr - > usermode = 0 ;
return 0 ;
}
2007-11-01 01:24:25 +03:00
int kvm_arch_vcpu_ioctl_get_fpu ( struct kvm_vcpu * vcpu , struct kvm_fpu * fpu )
{
2010-05-17 13:08:28 +04:00
struct i387_fxsave_struct * fxsave =
& vcpu - > arch . guest_fpu . state - > fxsave ;
2007-11-01 01:24:25 +03:00
memcpy ( fpu - > fpr , fxsave - > st_space , 128 ) ;
fpu - > fcw = fxsave - > cwd ;
fpu - > fsw = fxsave - > swd ;
fpu - > ftwx = fxsave - > twd ;
fpu - > last_opcode = fxsave - > fop ;
fpu - > last_ip = fxsave - > rip ;
fpu - > last_dp = fxsave - > rdp ;
memcpy ( fpu - > xmm , fxsave - > xmm_space , sizeof fxsave - > xmm_space ) ;
return 0 ;
}
int kvm_arch_vcpu_ioctl_set_fpu ( struct kvm_vcpu * vcpu , struct kvm_fpu * fpu )
{
2010-05-17 13:08:28 +04:00
struct i387_fxsave_struct * fxsave =
& vcpu - > arch . guest_fpu . state - > fxsave ;
2007-11-01 01:24:25 +03:00
memcpy ( fxsave - > st_space , fpu - > fpr , 128 ) ;
fxsave - > cwd = fpu - > fcw ;
fxsave - > swd = fpu - > fsw ;
fxsave - > twd = fpu - > ftwx ;
fxsave - > fop = fpu - > last_opcode ;
fxsave - > rip = fpu - > last_ip ;
fxsave - > rdp = fpu - > last_dp ;
memcpy ( fxsave - > xmm_space , fpu - > xmm , sizeof fxsave - > xmm_space ) ;
return 0 ;
}
2010-05-25 18:01:50 +04:00
int fx_init ( struct kvm_vcpu * vcpu )
2007-11-01 01:24:25 +03:00
{
2010-05-25 18:01:50 +04:00
int err ;
err = fpu_alloc ( & vcpu - > arch . guest_fpu ) ;
if ( err )
return err ;
2010-05-17 13:08:28 +04:00
fpu_finit ( & vcpu - > arch . guest_fpu ) ;
2007-11-01 01:24:25 +03:00
2010-06-10 07:27:12 +04:00
/*
* Ensure guest xcr0 is valid for loading
*/
vcpu - > arch . xcr0 = XSTATE_FP ;
2007-12-13 18:50:52 +03:00
vcpu - > arch . cr0 | = X86_CR0_ET ;
2010-05-25 18:01:50 +04:00
return 0 ;
2007-11-01 01:24:25 +03:00
}
EXPORT_SYMBOL_GPL ( fx_init ) ;
2010-05-17 13:08:28 +04:00
static void fx_free ( struct kvm_vcpu * vcpu )
{
fpu_free ( & vcpu - > arch . guest_fpu ) ;
}
2007-11-01 01:24:25 +03:00
void kvm_load_guest_fpu ( struct kvm_vcpu * vcpu )
{
2010-01-21 16:31:45 +03:00
if ( vcpu - > guest_fpu_loaded )
2007-11-01 01:24:25 +03:00
return ;
2010-06-10 07:27:12 +04:00
/*
 * Restore all possible states in the guest,
 * and assume host would use all available bits.
 * Guest xcr0 would be loaded later.
 */
kvm_put_guest_xcr0 ( vcpu ) ;
2007-11-01 01:24:25 +03:00
vcpu - > guest_fpu_loaded = 1 ;
2012-09-20 22:01:49 +04:00
__kernel_fpu_begin ( ) ;
2010-05-17 13:08:28 +04:00
fpu_restore_checking ( & vcpu - > arch . guest_fpu ) ;
2010-01-21 16:31:52 +03:00
trace_kvm_fpu ( 1 ) ;
2007-11-01 01:24:25 +03:00
}
void kvm_put_guest_fpu ( struct kvm_vcpu * vcpu )
{
2010-06-10 07:27:12 +04:00
kvm_put_guest_xcr0 ( vcpu ) ;
2007-11-01 01:24:25 +03:00
if ( ! vcpu - > guest_fpu_loaded )
return ;
vcpu - > guest_fpu_loaded = 0 ;
2010-05-17 13:08:28 +04:00
fpu_save_init ( & vcpu - > arch . guest_fpu ) ;
2012-09-20 22:01:49 +04:00
__kernel_fpu_end ( ) ;
2007-11-18 14:54:33 +03:00
+ + vcpu - > stat . fpu_reload ;
2010-05-10 13:34:53 +04:00
kvm_make_request ( KVM_REQ_DEACTIVATE_FPU , vcpu ) ;
2010-01-21 16:31:52 +03:00
trace_kvm_fpu ( 0 ) ;
2007-11-01 01:24:25 +03:00
}
2007-11-14 15:38:21 +03:00
void kvm_arch_vcpu_free ( struct kvm_vcpu * vcpu )
{
2011-02-01 22:16:40 +03:00
kvmclock_reset ( vcpu ) ;
2009-02-25 18:08:31 +03:00
2010-06-30 08:25:15 +04:00
free_cpumask_var ( vcpu - > arch . wbinvd_dirty_mask ) ;
2010-05-17 13:08:28 +04:00
fx_free ( vcpu ) ;
2007-11-14 15:38:21 +03:00
kvm_x86_ops - > vcpu_free ( vcpu ) ;
}
struct kvm_vcpu * kvm_arch_vcpu_create ( struct kvm * kvm ,
unsigned int id )
{
2010-08-20 12:07:22 +04:00
if ( check_tsc_unstable ( ) & & atomic_read ( & kvm - > online_vcpus ) ! = 0 )
printk_once ( KERN_WARNING
" kvm: SMP vm created on host with unstable TSC; "
" guest TSC will not be reliable \n " ) ;
2007-11-20 16:30:24 +03:00
return kvm_x86_ops - > vcpu_create ( kvm , id ) ;
}
2007-11-14 15:38:21 +03:00
2007-11-20 16:30:24 +03:00
int kvm_arch_vcpu_setup ( struct kvm_vcpu * vcpu )
{
int r ;
2007-11-14 15:38:21 +03:00
2008-10-09 12:01:54 +04:00
vcpu - > arch . mtrr_state . have_fixed = 1 ;
2007-11-14 15:38:21 +03:00
vcpu_load ( vcpu ) ;
r = kvm_arch_vcpu_reset ( vcpu ) ;
if ( r = = 0 )
r = kvm_mmu_setup ( vcpu ) ;
vcpu_put ( vcpu ) ;
2007-11-20 16:30:24 +03:00
return r ;
2007-11-14 15:38:21 +03:00
}
2007-11-19 23:04:43 +03:00
void kvm_arch_vcpu_destroy ( struct kvm_vcpu * vcpu )
2007-11-14 15:38:21 +03:00
{
2010-10-14 13:22:50 +04:00
vcpu - > arch . apf . msr_val = 0 ;
2007-11-14 15:38:21 +03:00
vcpu_load ( vcpu ) ;
kvm_mmu_unload ( vcpu ) ;
vcpu_put ( vcpu ) ;
2010-05-17 13:08:28 +04:00
fx_free ( vcpu ) ;
2007-11-14 15:38:21 +03:00
kvm_x86_ops - > vcpu_free ( vcpu ) ;
}
int kvm_arch_vcpu_reset ( struct kvm_vcpu * vcpu )
{
2011-09-20 14:43:14 +04:00
atomic_set ( & vcpu - > arch . nmi_queued , 0 ) ;
vcpu - > arch . nmi_pending = 0 ;
2008-09-26 11:30:48 +04:00
vcpu - > arch . nmi_injected = false ;
2008-12-15 15:52:10 +03:00
vcpu - > arch . switch_db_regs = 0 ;
memset ( vcpu - > arch . db , 0 , sizeof ( vcpu - > arch . db ) ) ;
vcpu - > arch . dr6 = DR6_FIXED_1 ;
vcpu - > arch . dr7 = DR7_FIXED_1 ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2010-10-14 13:22:50 +04:00
vcpu - > arch . apf . msr_val = 0 ;
2011-07-11 23:28:14 +04:00
vcpu - > arch . st . msr_val = 0 ;
2010-07-27 13:30:24 +04:00
2011-02-01 22:16:40 +03:00
kvmclock_reset ( vcpu ) ;
2010-10-14 13:22:46 +04:00
kvm_clear_async_pf_completion_queue ( vcpu ) ;
kvm_async_pf_hash_reset ( vcpu ) ;
vcpu - > arch . apf . halted = false ;
2010-07-27 13:30:24 +04:00
2011-11-10 16:57:22 +04:00
kvm_pmu_reset ( vcpu ) ;
2007-11-14 15:38:21 +03:00
return kvm_x86_ops - > vcpu_reset ( vcpu ) ;
}
2009-09-15 13:37:46 +04:00
int kvm_arch_hardware_enable ( void * garbage )
2007-11-14 15:38:21 +03:00
{
2010-08-20 12:07:28 +04:00
struct kvm * kvm ;
struct kvm_vcpu * vcpu ;
int i ;
2012-02-03 21:43:56 +04:00
int ret ;
u64 local_tsc ;
u64 max_tsc = 0 ;
bool stable , backwards_tsc = false ;
2009-09-07 12:12:18 +04:00
kvm_shared_msr_cpu_online ( ) ;
2012-02-03 21:43:56 +04:00
ret = kvm_x86_ops - > hardware_enable ( garbage ) ;
if ( ret ! = 0 )
return ret ;
local_tsc = native_read_tsc ( ) ;
stable = ! check_tsc_unstable ( ) ;
list_for_each_entry ( kvm , & vm_list , vm_list ) {
kvm_for_each_vcpu ( i , vcpu , kvm ) {
if ( ! stable & & vcpu - > cpu = = smp_processor_id ( ) )
set_bit ( KVM_REQ_CLOCK_UPDATE , & vcpu - > requests ) ;
if ( stable & & vcpu - > arch . last_host_tsc > local_tsc ) {
backwards_tsc = true ;
if ( vcpu - > arch . last_host_tsc > max_tsc )
max_tsc = vcpu - > arch . last_host_tsc ;
}
}
}
/*
 * Sometimes, even reliable TSCs go backwards. This happens on
 * platforms that reset TSC during suspend or hibernate actions, but
 * maintain synchronization. We must compensate. Fortunately, we can
 * detect that condition here, which happens early in CPU bringup,
 * before any KVM threads can be running. Unfortunately, we can't
 * bring the TSCs fully up to date with real time, as we aren't yet far
 * enough into CPU bringup that we know how much real time has actually
 * elapsed; our helper function, get_kernel_ns() will be using boot
 * variables that haven't been updated yet.
 *
 * So we simply find the maximum observed TSC above, then record the
 * adjustment to TSC in each VCPU. When the VCPU later gets loaded,
 * the adjustment will be applied. Note that we accumulate
 * adjustments, in case multiple suspend cycles happen before some VCPU
 * gets a chance to run again. In the event that no KVM threads get a
 * chance to run, we will miss the entire elapsed period, as we'll have
 * reset last_host_tsc, so VCPUs will not have the TSC adjusted and may
 * lose cycle time. This isn't too big a deal, since the loss will be
 * uniform across all VCPUs (not to mention the scenario is extremely
 * unlikely). It is possible that a second hibernate recovery happens
 * much faster than a first, causing the observed TSC here to be
 * smaller; this would require additional padding adjustment, which is
 * why we set last_host_tsc to the local tsc observed here.
 *
 * N.B. - this code below runs only on platforms with reliable TSC,
 * as that is the only way backwards_tsc is set above. Also note
 * that this runs for ALL vcpus, which is not a bug; all VCPUs should
 * have the same delta_cyc adjustment applied if backwards_tsc
 * is detected. Note further, this adjustment is only done once,
 * as we reset last_host_tsc on all VCPUs to stop this from being
 * called multiple times (one for each physical CPU bringup).
 *
 * Platforms with unreliable TSCs don't have to deal with this, they
 * will be compensated by the logic in vcpu_load, which sets the TSC to
 * catchup mode. This will catchup all VCPUs to real time, but cannot
 * guarantee that they stay in perfect synchronization.
 */
if ( backwards_tsc ) {
u64 delta_cyc = max_tsc - local_tsc ;
list_for_each_entry ( kvm , & vm_list , vm_list ) {
kvm_for_each_vcpu ( i , vcpu , kvm ) {
vcpu - > arch . tsc_offset_adjustment + = delta_cyc ;
vcpu - > arch . last_host_tsc = local_tsc ;
}
/*
 * We have to disable TSC offset matching.. if you were
 * booting a VM while issuing an S4 host suspend....
 * you may have some problem. Solving this issue is
 * left as an exercise to the reader.
 */
kvm - > arch . last_tsc_nsec = 0 ;
kvm - > arch . last_tsc_write = 0 ;
}
}
return 0 ;
2007-11-14 15:38:21 +03:00
}
void kvm_arch_hardware_disable ( void * garbage )
{
kvm_x86_ops - > hardware_disable ( garbage ) ;
2009-11-28 15:18:47 +03:00
drop_user_return_notifiers ( garbage ) ;
2007-11-14 15:38:21 +03:00
}
int kvm_arch_hardware_setup ( void )
{
return kvm_x86_ops - > hardware_setup ( ) ;
}
void kvm_arch_hardware_unsetup ( void )
{
kvm_x86_ops - > hardware_unsetup ( ) ;
}
void kvm_arch_check_processor_compat ( void * rtn )
{
kvm_x86_ops - > check_processor_compatibility ( rtn ) ;
}
2012-03-05 16:23:29 +04:00
bool kvm_vcpu_compatible ( struct kvm_vcpu * vcpu )
{
return irqchip_in_kernel ( vcpu - > kvm ) = = ( vcpu - > arch . apic ! = NULL ) ;
}
2007-11-14 15:38:21 +03:00
int kvm_arch_vcpu_init ( struct kvm_vcpu * vcpu )
{
struct page * page ;
struct kvm * kvm ;
int r ;
BUG_ON ( vcpu - > kvm = = NULL ) ;
kvm = vcpu - > kvm ;
2010-07-29 16:11:50 +04:00
vcpu - > arch . emulate_ctxt . ops = & emulate_ops ;
2009-06-09 16:56:26 +04:00
if ( ! irqchip_in_kernel ( kvm ) | | kvm_vcpu_is_bsp ( vcpu ) )
2008-04-13 18:54:35 +04:00
vcpu - > arch . mp_state = KVM_MP_STATE_RUNNABLE ;
2007-11-14 15:38:21 +03:00
else
2008-04-13 18:54:35 +04:00
vcpu - > arch . mp_state = KVM_MP_STATE_UNINITIALIZED ;
2007-11-14 15:38:21 +03:00
page = alloc_page ( GFP_KERNEL | __GFP_ZERO ) ;
if ( ! page ) {
r = - ENOMEM ;
goto fail ;
}
2007-12-13 18:50:52 +03:00
vcpu - > arch . pio_data = page_address ( page ) ;
2007-11-14 15:38:21 +03:00
KVM: Infrastructure for software and hardware based TSC rate scaling
This requires some restructuring; rather than use 'virtual_tsc_khz'
to indicate whether hardware rate scaling is in effect, we consider
each VCPU to always have a virtual TSC rate. Instead, there is new
logic above the vendor-specific hardware scaling that decides whether
it is even necessary to use and updates all rate variables used by
common code. This means we can simply query the virtual rate at
any point, which is needed for software rate scaling.
There is also now a threshold added to the TSC rate scaling; minor
differences and variations of measured TSC rate can accidentally
provoke rate scaling to be used when it is not needed. Instead,
we have a tolerance variable called tsc_tolerance_ppm, which is
the maximum variation from user requested rate at which scaling
will be used. The default is 250ppm, which is the half the
threshold for NTP adjustment, allowing for some hardware variation.
In the event that hardware rate scaling is not available, we can
kludge a bit by forcing TSC catchup to turn on when a faster than
hardware speed has been requested, but there is nothing available
yet for the reverse case; this requires a trap and emulate software
implementation for RDTSC, which is still forthcoming.
[avi: fix 64-bit division on i386]
Signed-off-by: Zachary Amsden <zamsden@gmail.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
2012-02-03 21:43:50 +04:00
kvm_set_tsc_khz ( vcpu , max_tsc_khz ) ;
2010-09-19 04:38:15 +04:00
2007-11-14 15:38:21 +03:00
r = kvm_mmu_create ( vcpu ) ;
if ( r < 0 )
goto fail_free_pio_data ;
if ( irqchip_in_kernel ( kvm ) ) {
r = kvm_create_lapic ( vcpu ) ;
if ( r < 0 )
goto fail_mmu_destroy ;
}
2009-05-11 12:48:15 +04:00
vcpu - > arch . mce_banks = kzalloc ( KVM_MAX_MCE_BANKS * sizeof ( u64 ) * 4 ,
GFP_KERNEL ) ;
if ( ! vcpu - > arch . mce_banks ) {
r = - ENOMEM ;
2010-01-22 09:21:29 +03:00
goto fail_free_lapic ;
2009-05-11 12:48:15 +04:00
}
vcpu - > arch . mcg_cap = KVM_MAX_MCE_BANKS ;
2010-06-30 08:25:15 +04:00
if ( ! zalloc_cpumask_var ( & vcpu - > arch . wbinvd_dirty_mask , GFP_KERNEL ) ) {
/* r still holds 0 from the earlier successful calls; report the failure. */
r = - ENOMEM ;
goto fail_free_mce_banks ;
}
2010-10-14 13:22:46 +04:00
kvm_async_pf_hash_reset ( vcpu ) ;
2011-11-10 16:57:22 +04:00
kvm_pmu_init ( vcpu ) ;
2010-10-14 13:22:46 +04:00
2007-11-14 15:38:21 +03:00
return 0 ;
2010-06-30 08:25:15 +04:00
fail_free_mce_banks :
kfree ( vcpu - > arch . mce_banks ) ;
2010-01-22 09:21:29 +03:00
fail_free_lapic :
kvm_free_lapic ( vcpu ) ;
2007-11-14 15:38:21 +03:00
fail_mmu_destroy :
kvm_mmu_destroy ( vcpu ) ;
fail_free_pio_data :
2007-12-13 18:50:52 +03:00
free_page ( ( unsigned long ) vcpu - > arch . pio_data ) ;
2007-11-14 15:38:21 +03:00
fail :
return r ;
}
void kvm_arch_vcpu_uninit ( struct kvm_vcpu * vcpu )
{
2009-12-23 19:35:25 +03:00
int idx ;
2011-11-10 16:57:22 +04:00
kvm_pmu_destroy ( vcpu ) ;
2010-01-22 09:18:47 +03:00
kfree ( vcpu - > arch . mce_banks ) ;
2007-11-14 15:38:21 +03:00
kvm_free_lapic ( vcpu ) ;
2009-12-23 19:35:25 +03:00
idx = srcu_read_lock ( & vcpu - > kvm - > srcu ) ;
2007-11-14 15:38:21 +03:00
kvm_mmu_destroy ( vcpu ) ;
2009-12-23 19:35:25 +03:00
srcu_read_unlock ( & vcpu - > kvm - > srcu , idx ) ;
2007-12-13 18:50:52 +03:00
free_page ( ( unsigned long ) vcpu - > arch . pio_data ) ;
2007-11-14 15:38:21 +03:00
}
2007-11-18 13:43:45 +03:00
2012-01-04 13:25:20 +04:00
int kvm_arch_init_vm ( struct kvm * kvm , unsigned long type )
2007-11-18 13:43:45 +03:00
{
2012-01-04 13:25:20 +04:00
if ( type )
return - EINVAL ;
2007-12-14 05:01:48 +03:00
INIT_LIST_HEAD ( & kvm - > arch . active_mmu_pages ) ;
2008-07-28 20:26:26 +04:00
INIT_LIST_HEAD ( & kvm - > arch . assigned_dev_head ) ;
2007-11-18 13:43:45 +03:00
2008-10-15 16:15:06 +04:00
/* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
set_bit ( KVM_USERSPACE_IRQ_SOURCE_ID , & kvm - > arch . irq_sources_bitmap ) ;
2011-02-04 12:49:11 +03:00
raw_spin_lock_init ( & kvm - > arch . tsc_write_lock ) ;
2008-12-11 22:45:05 +03:00
2010-11-09 19:02:49 +03:00
return 0 ;
2007-11-18 13:43:45 +03:00
}
static void kvm_unload_vcpu_mmu ( struct kvm_vcpu * vcpu )
{
vcpu_load ( vcpu ) ;
kvm_mmu_unload ( vcpu ) ;
vcpu_put ( vcpu ) ;
}
static void kvm_free_vcpus ( struct kvm * kvm )
{
unsigned int i ;
2009-06-09 16:56:29 +04:00
struct kvm_vcpu * vcpu ;
2007-11-18 13:43:45 +03:00
/*
 * Unpin any mmu pages first.
 */
2010-10-14 13:22:46 +04:00
kvm_for_each_vcpu ( i , vcpu , kvm ) {
kvm_clear_async_pf_completion_queue ( vcpu ) ;
2009-06-09 16:56:29 +04:00
kvm_unload_vcpu_mmu ( vcpu ) ;
2010-10-14 13:22:46 +04:00
}
2009-06-09 16:56:29 +04:00
kvm_for_each_vcpu ( i , vcpu , kvm )
kvm_arch_vcpu_free ( vcpu ) ;
mutex_lock ( & kvm - > lock ) ;
for ( i = 0 ; i < atomic_read ( & kvm - > online_vcpus ) ; i + + )
kvm - > vcpus [ i ] = NULL ;
2007-11-18 13:43:45 +03:00
2009-06-09 16:56:29 +04:00
atomic_set ( & kvm - > online_vcpus , 0 ) ;
mutex_unlock ( & kvm - > lock ) ;
2007-11-18 13:43:45 +03:00
}
2009-01-06 05:03:02 +03:00
void kvm_arch_sync_events ( struct kvm * kvm )
{
2009-01-06 05:03:03 +03:00
kvm_free_all_assigned_devices ( kvm ) ;
2010-07-10 13:37:56 +04:00
kvm_free_pit ( kvm ) ;
2009-01-06 05:03:02 +03:00
}
2007-11-18 13:43:45 +03:00
void kvm_arch_destroy_vm ( struct kvm * kvm )
{
2008-10-31 07:37:41 +03:00
kvm_iommu_unmap_guest ( kvm ) ;
2007-12-14 05:17:34 +03:00
kfree ( kvm - > arch . vpic ) ;
kfree ( kvm - > arch . vioapic ) ;
2007-11-18 13:43:45 +03:00
kvm_free_vcpus ( kvm ) ;
2008-03-25 12:26:13 +03:00
if ( kvm - > arch . apic_access_page )
put_page ( kvm - > arch . apic_access_page ) ;
2008-04-25 17:44:52 +04:00
if ( kvm - > arch . ept_identity_pagetable )
put_page ( kvm - > arch . ept_identity_pagetable ) ;
2007-11-18 13:43:45 +03:00
}
2007-11-20 11:25:04 +03:00
2012-02-08 08:02:18 +04:00
void kvm_arch_free_memslot ( struct kvm_memory_slot * free ,
struct kvm_memory_slot * dont )
{
int i ;
for ( i = 0 ; i < KVM_NR_PAGE_SIZES - 1 ; + + i ) {
if ( ! dont | | free - > arch . lpage_info [ i ] ! = dont - > arch . lpage_info [ i ] ) {
2012-05-20 08:15:07 +04:00
kvm_kvfree ( free - > arch . lpage_info [ i ] ) ;
2012-02-08 08:02:18 +04:00
free - > arch . lpage_info [ i ] = NULL ;
}
}
}
int kvm_arch_create_memslot ( struct kvm_memory_slot * slot , unsigned long npages )
{
int i ;
for ( i = 0 ; i < KVM_NR_PAGE_SIZES - 1 ; + + i ) {
unsigned long ugfn ;
int lpages ;
int level = i + 2 ;
lpages = gfn_to_index ( slot - > base_gfn + npages - 1 ,
slot - > base_gfn , level ) + 1 ;
slot - > arch . lpage_info [ i ] =
2012-05-20 08:15:07 +04:00
kvm_kvzalloc ( lpages * sizeof ( * slot - > arch . lpage_info [ i ] ) ) ;
2012-02-08 08:02:18 +04:00
if ( ! slot - > arch . lpage_info [ i ] )
goto out_free ;
if ( slot - > base_gfn & ( KVM_PAGES_PER_HPAGE ( level ) - 1 ) )
slot - > arch . lpage_info [ i ] [ 0 ] . write_count = 1 ;
if ( ( slot - > base_gfn + npages ) & ( KVM_PAGES_PER_HPAGE ( level ) - 1 ) )
slot - > arch . lpage_info [ i ] [ lpages - 1 ] . write_count = 1 ;
ugfn = slot - > userspace_addr > > PAGE_SHIFT ;
/*
* If the gfn and userspace address are not aligned wrt each
* other , or if explicitly asked to , disable large page
* support for this slot
*/
if ( ( slot - > base_gfn ^ ugfn ) & ( KVM_PAGES_PER_HPAGE ( level ) - 1 ) | |
! kvm_largepages_enabled ( ) ) {
unsigned long j ;
for ( j = 0 ; j < lpages ; + + j )
slot - > arch . lpage_info [ i ] [ j ] . write_count = 1 ;
}
}
return 0 ;
out_free :
for ( i = 0 ; i < KVM_NR_PAGE_SIZES - 1 ; + + i ) {
2012-06-19 17:04:56 +04:00
kvm_kvfree ( slot - > arch . lpage_info [ i ] ) ;
2012-02-08 08:02:18 +04:00
slot - > arch . lpage_info [ i ] = NULL ;
}
return - ENOMEM ;
}
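The large-page bookkeeping in kvm_arch_create_memslot comes down to a few lines of index arithmetic. The following is a minimal user-space sketch of that arithmetic, not the kernel code itself. It assumes that each page-size level covers 9 more gfn bits than the level below it (4K, 2M, 1G). The helper names (hpage_gfn_shift, pages_per_hpage, lpages_for_slot, head_misaligned, tail_misaligned, backing_misaligned) are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: level 1 = 4K pages, and each level up adds 9 gfn bits. */
static unsigned hpage_gfn_shift(int level)
{
	return (unsigned)(level - 1) * 9;
}

static uint64_t pages_per_hpage(int level)
{
	return 1ull << hpage_gfn_shift(level);
}

/* Index of the large page containing gfn, relative to the slot base. */
static uint64_t gfn_to_index(uint64_t gfn, uint64_t base_gfn, int level)
{
	return (gfn >> hpage_gfn_shift(level)) -
	       (base_gfn >> hpage_gfn_shift(level));
}

/* Number of large-page entries needed to cover the whole slot. */
static uint64_t lpages_for_slot(uint64_t base_gfn, uint64_t npages, int level)
{
	assert(npages > 0);
	return gfn_to_index(base_gfn + npages - 1, base_gfn, level) + 1;
}

/* The first large page is only partly covered by the slot. */
static int head_misaligned(uint64_t base_gfn, int level)
{
	return (base_gfn & (pages_per_hpage(level) - 1)) != 0;
}

/* The last large page is only partly covered by the slot. */
static int tail_misaligned(uint64_t base_gfn, uint64_t npages, int level)
{
	return ((base_gfn + npages) & (pages_per_hpage(level) - 1)) != 0;
}

/* Guest frame and host backing frame differ in their low bits. */
static int backing_misaligned(uint64_t base_gfn, uint64_t ugfn, int level)
{
	return ((base_gfn ^ ugfn) & (pages_per_hpage(level) - 1)) != 0;
}
```

In the function above, a partially covered head or tail entry gets write_count = 1, which rules out a large mapping for that entry only. A misaligned backing address disables large pages for every entry of the slot at that level.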
2009-12-23 19:35:18 +03:00
int kvm_arch_prepare_memory_region ( struct kvm * kvm ,
struct kvm_memory_slot * memslot ,
2007-11-20 11:25:04 +03:00
struct kvm_memory_slot old ,
2009-12-23 19:35:18 +03:00
struct kvm_userspace_memory_region * mem ,
2007-11-20 11:25:04 +03:00
int user_alloc )
{
2009-12-23 19:35:18 +03:00
int npages = memslot - > npages ;
2010-06-21 11:57:45 +04:00
int map_flags = MAP_PRIVATE | MAP_ANONYMOUS ;
/* Prevent internal slot pages from being moved by fork()/COW. */
if ( memslot - > id > = KVM_MEMORY_SLOTS )
map_flags = MAP_SHARED | MAP_ANONYMOUS ;
2007-11-20 11:25:04 +03:00
/* To keep backward compatibility with older userspace ,
* x86 needs to handle the ! user_alloc case .
*/
if ( ! user_alloc ) {
if ( npages & & ! old . rmap ) {
2008-07-25 18:32:03 +04:00
unsigned long userspace_addr ;
2012-04-21 04:13:58 +04:00
userspace_addr = vm_mmap ( NULL , 0 ,
2008-07-25 18:32:03 +04:00
npages * PAGE_SIZE ,
PROT_READ | PROT_WRITE ,
2010-06-21 11:57:45 +04:00
map_flags ,
2008-07-25 18:32:03 +04:00
0 ) ;
2007-11-20 11:25:04 +03:00
2008-07-25 18:32:03 +04:00
if ( IS_ERR ( ( void * ) userspace_addr ) )
return PTR_ERR ( ( void * ) userspace_addr ) ;
memslot - > userspace_addr = userspace_addr ;
2007-11-20 11:25:04 +03:00
}
}
2009-12-23 19:35:18 +03:00
return 0 ;
}
void kvm_arch_commit_memory_region ( struct kvm * kvm ,
struct kvm_userspace_memory_region * mem ,
struct kvm_memory_slot old ,
int user_alloc )
{
2011-03-04 13:59:21 +03:00
int nr_mmu_pages = 0 , npages = mem - > memory_size > > PAGE_SHIFT ;
2009-12-23 19:35:18 +03:00
if ( ! user_alloc & & ! old . user_alloc & & old . rmap & & ! npages ) {
int ret ;
2012-04-21 05:57:04 +04:00
ret = vm_munmap ( old . userspace_addr ,
2009-12-23 19:35:18 +03:00
old . npages * PAGE_SIZE ) ;
if ( ret < 0 )
printk ( KERN_WARNING
" kvm_vm_ioctl_set_memory_region: "
" failed to munmap memory \n " ) ;
}
2011-03-04 13:59:21 +03:00
if ( ! kvm - > arch . n_requested_mmu_pages )
nr_mmu_pages = kvm_mmu_calculate_mmu_pages ( kvm ) ;
2009-05-13 01:55:43 +04:00
spin_lock ( & kvm - > mmu_lock ) ;
2011-03-04 13:59:21 +03:00
if ( nr_mmu_pages )
2007-11-20 11:25:04 +03:00
kvm_mmu_change_mmu_pages ( kvm , nr_mmu_pages ) ;
kvm_mmu_slot_remove_write_access ( kvm , mem - > slot ) ;
2009-05-13 01:55:43 +04:00
spin_unlock ( & kvm - > mmu_lock ) ;
2007-11-20 11:25:04 +03:00
}
2007-12-14 04:35:10 +03:00
2008-07-11 03:49:31 +04:00
void kvm_arch_flush_shadow ( struct kvm * kvm )
{
kvm_mmu_zap_all ( kvm ) ;
2009-05-13 01:55:45 +04:00
kvm_reload_remote_mmus ( kvm ) ;
2008-07-11 03:49:31 +04:00
}
2007-12-14 04:35:10 +03:00
int kvm_arch_vcpu_runnable ( struct kvm_vcpu * vcpu )
{
2010-10-14 13:22:46 +04:00
return ( vcpu - > arch . mp_state = = KVM_MP_STATE_RUNNABLE & &
! vcpu - > arch . apf . halted )
| | ! list_empty_careful ( & vcpu - > async_pf . done )
2009-07-09 16:33:52 +04:00
| | vcpu - > arch . mp_state = = KVM_MP_STATE_SIPI_RECEIVED
2011-09-20 14:43:14 +04:00
| | atomic_read ( & vcpu - > arch . nmi_queued ) | |
2009-07-09 16:33:52 +04:00
( kvm_arch_interrupt_allowed ( vcpu ) & &
kvm_cpu_has_interrupt ( vcpu ) ) ;
2007-12-14 04:35:10 +03:00
}
2007-12-17 09:21:40 +03:00
2012-03-09 01:44:24 +04:00
int kvm_arch_vcpu_should_kick ( struct kvm_vcpu * vcpu )
2007-12-17 09:21:40 +03:00
{
2012-03-09 01:44:24 +04:00
return kvm_vcpu_exiting_guest_mode ( vcpu ) = = IN_GUEST_MODE ;
2007-12-17 09:21:40 +03:00
}
2009-03-23 13:12:11 +03:00
int kvm_arch_interrupt_allowed ( struct kvm_vcpu * vcpu )
{
return kvm_x86_ops - > interrupt_allowed ( vcpu ) ;
}
2009-06-17 16:22:14 +04:00
2010-02-23 19:47:55 +03:00
bool kvm_is_linear_rip ( struct kvm_vcpu * vcpu , unsigned long linear_rip )
{
unsigned long current_rip = kvm_rip_read ( vcpu ) +
get_segment_base ( vcpu , VCPU_SREG_CS ) ;
return current_rip = = linear_rip ;
}
EXPORT_SYMBOL_GPL ( kvm_is_linear_rip ) ;
2009-10-18 15:24:44 +04:00
unsigned long kvm_get_rflags ( struct kvm_vcpu * vcpu )
{
unsigned long rflags ;
rflags = kvm_x86_ops - > get_rflags ( vcpu ) ;
if ( vcpu - > guest_debug & KVM_GUESTDBG_SINGLESTEP )
2010-02-23 19:47:58 +03:00
rflags & = ~ X86_EFLAGS_TF ;
2009-10-18 15:24:44 +04:00
return rflags ;
}
EXPORT_SYMBOL_GPL ( kvm_get_rflags ) ;
void kvm_set_rflags ( struct kvm_vcpu * vcpu , unsigned long rflags )
{
if ( vcpu - > guest_debug & KVM_GUESTDBG_SINGLESTEP & &
2010-02-23 19:47:55 +03:00
kvm_is_linear_rip ( vcpu , vcpu - > arch . singlestep_rip ) )
2010-02-23 19:47:58 +03:00
rflags | = X86_EFLAGS_TF ;
2009-10-18 15:24:44 +04:00
kvm_x86_ops - > set_rflags ( vcpu , rflags ) ;
2010-07-27 13:30:24 +04:00
kvm_make_request ( KVM_REQ_EVENT , vcpu ) ;
2009-10-18 15:24:44 +04:00
}
EXPORT_SYMBOL_GPL ( kvm_set_rflags ) ;
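The two accessors above hide the host debugger's single-step trap flag from readers and re-apply it on writes. The sketch below models only that masking, under simplifying assumptions: struct toy_vcpu and both toy_* functions are hypothetical, the vendor callbacks are replaced by a plain field, and the comparison uses a bare rip rather than the linear rip (CS base plus rip) that kvm_is_linear_rip computes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EFLAGS_TF 0x100u	/* trap flag, bit 8 of EFLAGS */

/* Hypothetical stand-in for the vcpu state the accessors consult. */
struct toy_vcpu {
	bool host_singlestep;	/* host debugger requested single-stepping */
	uint64_t hw_rflags;	/* value the hardware would see */
	uint64_t rip;		/* current instruction pointer */
	uint64_t singlestep_rip; /* rip at which stepping was armed */
};

/* Readers never see the TF bit that the host debugger injected. */
static uint64_t toy_get_rflags(const struct toy_vcpu *v)
{
	uint64_t rflags = v->hw_rflags;

	if (v->host_singlestep)
		rflags &= ~(uint64_t)EFLAGS_TF;
	return rflags;
}

/* Writers get TF re-applied while still at the armed instruction. */
static void toy_set_rflags(struct toy_vcpu *v, uint64_t rflags)
{
	if (v->host_singlestep && v->rip == v->singlestep_rip)
		rflags |= EFLAGS_TF;
	v->hw_rflags = rflags;
}
```

A write followed by a read therefore returns the caller's value unchanged, even though the stored value carries TF while the step is pending.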
2010-10-17 20:13:42 +04:00
void kvm_arch_async_page_ready ( struct kvm_vcpu * vcpu , struct kvm_async_pf * work )
{
int r ;
2010-12-07 05:35:25 +03:00
if ( ( vcpu - > arch . mmu . direct_map ! = work - > arch . direct_map ) | |
2010-11-12 09:49:55 +03:00
is_error_page ( work - > page ) )
2010-10-17 20:13:42 +04:00
return ;
r = kvm_mmu_reload ( vcpu ) ;
if ( unlikely ( r ) )
return ;
2010-12-07 05:35:25 +03:00
if ( ! vcpu - > arch . mmu . direct_map & &
work - > arch . cr3 ! = vcpu - > arch . mmu . get_cr3 ( vcpu ) )
return ;
2010-10-17 20:13:42 +04:00
vcpu - > arch . mmu . page_fault ( vcpu , work - > gva , 0 , true ) ;
}
2010-10-14 13:22:46 +04:00
static inline u32 kvm_async_pf_hash_fn ( gfn_t gfn )
{
return hash_32 ( gfn & 0xffffffff , order_base_2 ( ASYNC_PF_PER_VCPU ) ) ;
}
static inline u32 kvm_async_pf_next_probe ( u32 key )
{
return ( key + 1 ) & ( roundup_pow_of_two ( ASYNC_PF_PER_VCPU ) - 1 ) ;
}
static void kvm_add_async_pf_gfn ( struct kvm_vcpu * vcpu , gfn_t gfn )
{
u32 key = kvm_async_pf_hash_fn ( gfn ) ;
while ( vcpu - > arch . apf . gfns [ key ] ! = ~ 0 )
key = kvm_async_pf_next_probe ( key ) ;
vcpu - > arch . apf . gfns [ key ] = gfn ;
}
static u32 kvm_async_pf_gfn_slot ( struct kvm_vcpu * vcpu , gfn_t gfn )
{
int i ;
u32 key = kvm_async_pf_hash_fn ( gfn ) ;
for ( i = 0 ; i < roundup_pow_of_two ( ASYNC_PF_PER_VCPU ) & &
2010-11-01 12:00:30 +03:00
( vcpu - > arch . apf . gfns [ key ] ! = gfn & &
vcpu - > arch . apf . gfns [ key ] ! = ~ 0 ) ; i + + )
2010-10-14 13:22:46 +04:00
key = kvm_async_pf_next_probe ( key ) ;
return key ;
}
bool kvm_find_async_pf_gfn ( struct kvm_vcpu * vcpu , gfn_t gfn )
{
return vcpu - > arch . apf . gfns [ kvm_async_pf_gfn_slot ( vcpu , gfn ) ] = = gfn ;
}
static void kvm_del_async_pf_gfn ( struct kvm_vcpu * vcpu , gfn_t gfn )
{
u32 i , j , k ;
i = j = kvm_async_pf_gfn_slot ( vcpu , gfn ) ;
while ( true ) {
vcpu - > arch . apf . gfns [ i ] = ~ 0 ;
do {
j = kvm_async_pf_next_probe ( j ) ;
if ( vcpu - > arch . apf . gfns [ j ] = = ~ 0 )
return ;
k = kvm_async_pf_hash_fn ( vcpu - > arch . apf . gfns [ j ] ) ;
/*
* k lies cyclically in ] i , j ]
* | i . k . j |
* | . . . . j i . k . | or | . k . . j i . . . |
*/
} while ( ( i < = j ) ? ( i < k & & k < = j ) : ( i < k | | k < = j ) ) ;
vcpu - > arch . apf . gfns [ i ] = vcpu - > arch . apf . gfns [ j ] ;
i = j ;
}
}
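The gfn helpers above form an open-addressing hash table with linear probing, where deletion shifts later entries back instead of leaving tombstones. The following is a self-contained sketch of the same scheme for experimenting outside the kernel. It rests on assumptions: TABLE_SIZE is a hypothetical power-of-two capacity, EMPTY plays the role of ~0, and hash_fn simply takes the low bits of the key rather than using the kernel's hash_32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TABLE_SIZE 64u		/* hypothetical; must be a power of two */
#define EMPTY UINT64_MAX	/* sentinel for a free slot */

static uint64_t table[TABLE_SIZE];

/* Stand-in hash: low bits of the key (not the kernel's hash_32). */
static uint32_t hash_fn(uint64_t gfn)
{
	return (uint32_t)(gfn & (TABLE_SIZE - 1));
}

static uint32_t next_probe(uint32_t key)
{
	return (key + 1) & (TABLE_SIZE - 1);
}

static void table_init(void)
{
	for (uint32_t i = 0; i < TABLE_SIZE; i++)
		table[i] = EMPTY;
}

/* The caller must guarantee that at least one slot is free. */
static void table_add(uint64_t gfn)
{
	uint32_t key = hash_fn(gfn);

	while (table[key] != EMPTY)
		key = next_probe(key);
	table[key] = gfn;
}

/* Slot holding gfn, or the first free slot on its probe path. */
static uint32_t table_slot(uint64_t gfn)
{
	uint32_t key = hash_fn(gfn);

	for (uint32_t i = 0; i < TABLE_SIZE &&
	     table[key] != gfn && table[key] != EMPTY; i++)
		key = next_probe(key);
	return key;
}

static bool table_find(uint64_t gfn)
{
	return table[table_slot(gfn)] == gfn;
}

/* Remove gfn, then pull back any entry the new hole would cut off. */
static void table_del(uint64_t gfn)
{
	uint32_t i, j, k;

	i = j = table_slot(gfn);
	assert(table[i] == gfn);	/* only delete present keys */
	for (;;) {
		table[i] = EMPTY;
		do {
			j = next_probe(j);
			if (table[j] == EMPTY)
				return;
			k = hash_fn(table[j]);
			/* skip entries whose home k lies cyclically in ]i, j] */
		} while ((i <= j) ? (i < k && k <= j) : (i < k || k <= j));
		table[i] = table[j];
		i = j;
	}
}
```

With this stand-in hash, keys 1, 65 and 129 all map to slot 1 and occupy slots 1, 2 and 3. Deleting 1 moves 65 and 129 back by one slot each, so lookups for both still succeed without any tombstone marker.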
2010-10-14 13:22:53 +04:00
static int apf_put_user ( struct kvm_vcpu * vcpu , u32 val )
{
return kvm_write_guest_cached ( vcpu - > kvm , & vcpu - > arch . apf . data , & val ,
sizeof ( val ) ) ;
}
2010-10-14 13:22:46 +04:00
void kvm_arch_async_page_not_present ( struct kvm_vcpu * vcpu ,
struct kvm_async_pf * work )
{
2010-11-29 17:12:30 +03:00
struct x86_exception fault ;
2010-10-14 13:22:53 +04:00
trace_kvm_async_pf_not_present ( work - > arch . token , work - > gva ) ;
2010-10-14 13:22:46 +04:00
kvm_add_async_pf_gfn ( vcpu , work - > arch . gfn ) ;
2010-10-14 13:22:53 +04:00
if ( ! ( vcpu - > arch . apf . msr_val & KVM_ASYNC_PF_ENABLED ) | |
2010-10-14 13:22:56 +04:00
( vcpu - > arch . apf . send_user_only & &
kvm_x86_ops - > get_cpl ( vcpu ) = = 0 ) )
2010-10-14 13:22:53 +04:00
kvm_make_request ( KVM_REQ_APF_HALT , vcpu ) ;
else if ( ! apf_put_user ( vcpu , KVM_PV_REASON_PAGE_NOT_PRESENT ) ) {
2010-11-29 17:12:30 +03:00
fault . vector = PF_VECTOR ;
fault . error_code_valid = true ;
fault . error_code = 0 ;
fault . nested_page_fault = false ;
fault . address = work - > arch . token ;
kvm_inject_page_fault ( vcpu , & fault ) ;
2010-10-14 13:22:53 +04:00
}
2010-10-14 13:22:46 +04:00
}
void kvm_arch_async_page_present ( struct kvm_vcpu * vcpu ,
struct kvm_async_pf * work )
{
2010-11-29 17:12:30 +03:00
struct x86_exception fault ;
2010-10-14 13:22:53 +04:00
trace_kvm_async_pf_ready ( work - > arch . token , work - > gva ) ;
if ( is_error_page ( work - > page ) )
work - > arch . token = ~ 0 ; /* broadcast wakeup */
else
kvm_del_async_pf_gfn ( vcpu , work - > arch . gfn ) ;
if ( ( vcpu - > arch . apf . msr_val & KVM_ASYNC_PF_ENABLED ) & &
! apf_put_user ( vcpu , KVM_PV_REASON_PAGE_READY ) ) {
2010-11-29 17:12:30 +03:00
fault . vector = PF_VECTOR ;
fault . error_code_valid = true ;
fault . error_code = 0 ;
fault . nested_page_fault = false ;
fault . address = work - > arch . token ;
kvm_inject_page_fault ( vcpu , & fault ) ;
2010-10-14 13:22:53 +04:00
}
2010-11-01 12:01:28 +03:00
vcpu - > arch . apf . halted = false ;
2012-05-03 12:36:39 +04:00
vcpu - > arch . mp_state = KVM_MP_STATE_RUNNABLE ;
2010-10-14 13:22:53 +04:00
}
bool kvm_arch_can_inject_async_page_present ( struct kvm_vcpu * vcpu )
{
if ( ! ( vcpu - > arch . apf . msr_val & KVM_ASYNC_PF_ENABLED ) )
return true ;
else
return ! kvm_event_needs_reinjection ( vcpu ) & &
kvm_x86_ops - > interrupt_allowed ( vcpu ) ;
2010-10-14 13:22:46 +04:00
}
2009-06-17 16:22:14 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_exit ) ;
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_inj_virq ) ;
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_page_fault ) ;
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_msr ) ;
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_cr ) ;
2009-10-09 18:08:27 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_nested_vmrun ) ;
2009-10-09 18:08:28 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_nested_vmexit ) ;
2009-10-09 18:08:29 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_nested_vmexit_inject ) ;
2009-10-09 18:08:30 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_nested_intr_vmexit ) ;
2009-10-09 18:08:31 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_invlpga ) ;
2009-10-09 18:08:32 +04:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_skinit ) ;
2010-02-24 20:59:14 +03:00
EXPORT_TRACEPOINT_SYMBOL_GPL ( kvm_nested_intercepts ) ;