399941 Commits

Marc Zyngier
9cbb6d969c ARM: KVM: fix L2CTLR to be per-cluster
The L2CTLR register contains the number of CPUs in this cluster.

Make sure the register content is actually relevant to the vcpu
that is being configured by computing the number of cores that are
part of its cluster.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
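
A minimal sketch of the per-cluster computation (the helper name and the
fixed 4-vcpu cluster size are assumptions, not the literal patch):

    /* Cores in the cluster of the vcpu being reset, encoded as N-1
     * the way the L2CTLR field expects. */
    static u32 l2ctlr_ncores(struct kvm_vcpu *vcpu)
    {
            u32 ncores = atomic_read(&vcpu->kvm->online_vcpus);

            ncores -= vcpu->vcpu_id & ~3;   /* cores from cluster start on */
            return min(ncores, 4U) - 1;     /* clamp to the cluster size   */
    }
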
Marc Zyngier
2d1d841bd4 ARM: KVM: Fix MPIDR computing to support virtual clusters
In order to be able to support more than 4 A7 or A15 CPUs,
we need to fix the MPIDR computing to reflect the fact that
both A15 and A7 can only exist in clusters of at most 4 CPUs.

Fix the MPIDR computing to allow virtual clusters to be exposed
to the guest.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-22 08:00:06 -07:00
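
A sketch of the resulting layout, assuming 4-cpu virtual clusters (names
are illustrative, and the U/SMP bit handling is omitted): Aff0 carries the
core's index within its cluster, Aff1 the virtual cluster number.

    /* vcpu_id 0..3 -> cluster 0, vcpu_id 4..7 -> cluster 1, ... */
    static u32 virt_mpidr(unsigned int vcpu_id)
    {
            return (vcpu_id & 3) |          /* Aff0: core within cluster */
                   ((vcpu_id >> 2) << 8);   /* Aff1: virtual cluster     */
    }
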
Christoffer Dall
e4b3c9c21b Merge branch 'kvm-arm-next-3.13-2' into kvm-arm-next 2013-10-17 20:41:44 -07:00
Christoffer Dall
2f8d01a147 Merge remote-tracking branch 'kvm/next' into kvm-arm-next 2013-10-17 20:40:08 -07:00
Christoffer Dall
9b5fdb9781 KVM: ARM: Transparent huge page (THP) support
Support transparent huge pages in KVM/ARM and KVM/ARM64.  The
transparent_hugepage_adjust is not very pretty, but this is also how
it's solved on x86 and seems to be simply an artifact of how THPs
behave.  This should eventually be shared across architectures if
possible, but that can always be changed down the road.

Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 17:06:30 -07:00
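
A simplified sketch of the adjustment (the function shape is illustrative;
the real code also pins and re-checks the compound page): if the faulting
pfn belongs to a THP, align both the guest IPA and the host pfn down to
the 2MB block so a stage-2 PMD can back it.

    static bool thp_adjust(pfn_t *pfnp, phys_addr_t *ipap)
    {
            if (!PageTransCompound(pfn_to_page(*pfnp)))
                    return false;           /* ordinary 4K page            */

            *ipap &= PMD_MASK;              /* align IPA to the 2MB block  */
            *pfnp &= ~(PTRS_PER_PMD - 1);   /* align the host pfn to match */
            return true;
    }
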
Christoffer Dall
ad361f093c KVM: ARM: Support hugetlbfs backed huge pages
Support huge pages in KVM/ARM and KVM/ARM64.  The pud_huge checking on
the unmap path may feel a bit silly as the pud_huge check is always
defined to false, but the compiler should be smart about this.

Note: This deals only with VMAs marked as huge, which are allocated
by users through hugetlbfs.  Transparent huge pages can only be
detected by looking at the underlying pages (or the page tables
themselves) and this patch so far simply maps these on a page-by-page
level in the Stage-2 page tables.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 17:06:20 -07:00
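
A sketch of the detection on the fault path, assuming the usual
fault-handler variables (vma, fault_ipa, gfn): a hugetlbfs VMA advertises
its page size, so the handler can pick a PMD-sized stage-2 mapping up front.

    if (is_vm_hugetlb_page(vma) && vma_kernel_pagesize(vma) == PMD_SIZE) {
            hugetlb = true;                  /* install a stage-2 PMD      */
            gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT; /* 2MB-aligned gfn */
    }
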
Christoffer Dall
86ed81aa2e KVM: ARM: Update comments for kvm_handle_wfi
Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx, as is done on arm64, for
consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 15:26:50 -07:00
Marc Zyngier
58d5ec8f8e ARM: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vcpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommitted.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline):	1.843s

2xA15 guest w/o patch:	2.083s
4xA15 guest w/o patch:	80.212s
8xA15 guest w/o patch:	Could not be bothered to find out

2xA15 guest w/ patch:	2.102s
4xA15 guest w/ patch:	3.205s
8xA15 guest w/ patch:	6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-17 15:26:50 -07:00
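
A sketch of the handler's shape after the change (the HSR bit name follows
the 32-bit code; kvm_vcpu_on_spin() is the generic helper that hands out
the scheduling boost):

    static int kvm_handle_wfx(struct kvm_vcpu *vcpu, struct kvm_run *run)
    {
            if (kvm_vcpu_get_hsr(vcpu) & HSR_WFI_IS_WFE)
                    kvm_vcpu_on_spin(vcpu); /* trapped WFE: yield the cpu */
            else
                    kvm_vcpu_block(vcpu);   /* plain WFI: sleep until irq */

            return 1;
    }
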
Gleb Natapov
d570142674 Updates for KVM/ARM including cpu=host and Cortex-A7 support
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.11 (GNU/Linux)
 
 iQEcBAABAgAGBQJSXeGaAAoJEEtpOizt6ddyeyYH/AnWdKGUELjxC0lIBDkTitnD
 znyzSxqXG6z1Z6d+EYI3XCL1eB3dtyOBSJsZj45adG4HXGkCmGqosgDzivGO6GcI
 yhjYgXGhP8ZvIwky1ijbVQODaEE70SEYqKwyCpU4rLJw2uRkbfRaxTrpgnusL8Bg
 RG37uaOS/sasLoNxCe5GEUjm8BFGbvZGVAjcL7yJTPBw5qd7GYBxndFSTILa2iRQ
 ikoBD0bUVhoaBUqSNQenoNllUBwDpFJF1HiEXKMJkUIxX/FggrSvRp8A/MAWDBw0
 6Ef1P8Pt/hMfMQpOOeu8QFWM2s+smh2rTkO/O9mqi/tSvEf5YcZHMAl48B8OR88=
 =tJ2u
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-3.13-1' of git://git.linaro.org/people/cdall/linux-kvm-arm into next

Updates for KVM/ARM including cpu=host and Cortex-A7 support
2013-10-16 15:30:32 +03:00
Christoffer Dall
82ea046c95 KVM: ARM: Update comments for kvm_handle_wfi
Update comments to reflect what is really going on and add the TWE bit
to the comments in kvm_arm.h.

Also renames the function to kvm_handle_wfx, as is done on arm64, for
consistency and uber-correctness.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-15 18:10:42 -07:00
Marc Zyngier
1f5580986a ARM: KVM: Yield CPU when vcpu executes a WFE
On an (even slightly) oversubscribed system, spinlocks are quickly
becoming a bottleneck, as some vcpus are spinning, waiting for a
lock to be released, while the vcpu holding the lock may not be
running at all.

This creates contention, and the observed slowdown is 40x for
hackbench. No, this isn't a typo.

The solution is to trap blocking WFEs and tell KVM that we're
now spinning. This ensures that other vcpus will get a scheduling
boost, allowing the lock to be released more quickly. Also, using
CONFIG_HAVE_KVM_CPU_RELAX_INTERCEPT slightly improves the performance
when the VM is severely overcommitted.

Quick test to estimate the performance: hackbench 1 process 1000

2xA15 host (baseline):	1.843s

2xA15 guest w/o patch:	2.083s
4xA15 guest w/o patch:	80.212s
8xA15 guest w/o patch:	Could not be bothered to find out

2xA15 guest w/ patch:	2.102s
4xA15 guest w/ patch:	3.205s
8xA15 guest w/ patch:	6.887s

So we go from a 40x degradation to 1.5x in the 2x overcommit case,
which is vaguely more acceptable.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-15 18:02:05 -07:00
Christoffer Dall
a7265fb175 KVM: ARM: Remove non-ASCII space characters
Some strange character leaped into the documentation, which makes
git-send-email behave quite strangely.  Get rid of this before it bites
anyone else.

Cc: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-15 17:43:00 -07:00
chai wen
f2e106692d KVM: Drop FOLL_GET in GUP when doing async page fault
Page pinning is not mandatory in kvm async page fault processing:
after the async page fault event is delivered to a guest, it accesses
the page once again and does its own GUP.  Drop the FOLL_GET flag in
GUP in the async_pf code, and simplify the check/clear processing.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Gu zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: chai wen <chaiw.fnst@cn.fujitsu.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-15 13:43:37 +03:00
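
A sketch using the 3.12-era GUP signature: with a NULL pages array,
get_user_pages() faults the page in without taking a reference, which is
exactly "GUP without FOLL_GET".

    down_read(&mm->mmap_sem);
    get_user_pages(NULL, mm, addr, 1, 1, 0, NULL, NULL);
    up_read(&mm->mmap_sem);
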
Christoffer Dall
a7efdf6bec KVM: s390: Get rid of KVM_HPAGE defines
Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:13 +03:00
Christoffer Dall
2c5350e934 KVM: PPC: Get rid of KVM_HPAGE defines
Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:11 +03:00
Christoffer Dall
bbba7938d1 KVM: ia64: Get rid of KVM_HPAGE defines
Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:09 +03:00
Christoffer Dall
015e0513a0 KVM: mips: Get rid of KVM_HPAGE defines
Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:08 +03:00
Christoffer Dall
ef0cfe71c2 KVM: arm64: Get rid of KVM_HPAGE defines
Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:12:06 +03:00
Christoffer Dall
dc6f6763df KVM: ARM: Get rid of KVM_HPAGE defines
The KVM_HPAGE defines are a little artificial on ARM, since the huge
page size is statically defined at compile time and there is only a
single huge page size.

Now that the main kvm code relying on these defines has been moved to
the x86-specific part of the world, we can get rid of these.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:11:56 +03:00
Christoffer Dall
6d9d41e574 KVM: Move gfn_to_index to x86 specific code
The gfn_to_index function relies on huge page defines which either may
not make sense on systems that don't support huge pages or are defined
in an inconvenient way for other architectures.  Since this is
x86-specific, move the function to arch/x86/include/asm/kvm_host.h.

Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-14 10:11:44 +03:00
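
The moved helper, roughly as it reads in the x86 header; KVM_HPAGE_GFN_SHIFT
is one of the x86-only defines the message refers to:

    static inline gfn_t gfn_to_index(gfn_t gfn, gfn_t base_gfn, int level)
    {
            /* KVM_HPAGE_GFN_SHIFT(PT_PAGE_TABLE_LEVEL) must be 0. */
            return (gfn >> KVM_HPAGE_GFN_SHIFT(level)) -
                   (base_gfn >> KVM_HPAGE_GFN_SHIFT(level));
    }
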
Jonathan Austin
e8c2d99f82 KVM: ARM: Add support for Cortex-A7
This patch adds support for running Cortex-A7 guests on Cortex-A7 hosts.

As Cortex-A7 is architecturally compatible with A15, this patch is largely just
generalising existing code. Areas where 'implementation defined' behaviour
is identical for A7 and A15 are moved to allow them to be used by both cores.

The check to ensure that coprocessor register tables are sorted correctly is
also moved into 'common' code to avoid each new cpu doing its own check
(and possibly forgetting to do so!).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 17:45:30 -07:00
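
A sketch of the shared sortedness check (function and comparator names are
assumptions): run once at init for every per-cpu table, so a new core's
table cannot silently regress.

    static bool table_is_sorted(const struct coproc_reg *tbl, unsigned int n)
    {
            unsigned int i;

            for (i = 1; i < n; i++)
                    if (cmp_reg(&tbl[i - 1], &tbl[i]) >= 0)
                            return false;   /* out of order, or duplicate */
            return true;
    }
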
Jonathan Austin
5e497046f0 KVM: ARM: fix the size of TTBCR_{T0SZ,T1SZ} masks
The T{0,1}SZ fields of TTBCR are 3 bits wide when using the long descriptor
format. Likewise, the T0SZ field of the HTCR is 3-bits. KVM currently
defines TTBCR_T{0,1}SZ as 3, not 7.

The T0SZ mask is used to calculate the value for the HTCR, both to pick out
TTBCR.T0SZ and mask off the equivalent field in the HTCR during
read-modify-write. The incorrect mask size causes the (UNKNOWN) reset value
of HTCR.T0SZ to leak into the calculated HTCR value. Linux will hang when
initializing KVM if HTCR's reset value has bit 2 set (sometimes the case on
A7/TC2).

Fixing T0SZ allows A7 cores to boot and T1SZ is also fixed for completeness.

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 17:44:39 -07:00
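
The shape of the fix, per the commit message (define names as in the
32-bit KVM headers): a 3-bit field needs a 3-bit mask, and with it the
HTCR read-modify-write clears all of HTCR.T0SZ before inserting TTBCR.T0SZ.

    #define TTBCR_T1SZ      (7 << 16)       /* was (3 << 16): 2 bits wide */
    #define TTBCR_T0SZ      (7 << 0)        /* was 3: same problem        */
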
Jonathan Austin
1158fca401 KVM: ARM: Fix calculation of virtual CPU ID
KVM does not have a notion of multiple clusters for CPUs, just a linear
array of CPUs. When using a system with cores in more than one cluster, the
current method for calculating the virtual MPIDR will leak the (physical)
cluster information into the virtual MPIDR. One effect of this is that
Linux under KVM fails to boot multiple CPUs that aren't in the 0th cluster.

This patch does away with exposing the real MPIDR fields in favour of simply
using the virtual CPU number (but preserving the U bit, as before).

Signed-off-by: Jonathan Austin <jonathan.austin@arm.com>
Acked-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-12 17:44:39 -07:00
Arthur Chunqi Li
7854cbca81 KVM: nVMX: Fully support nested VMX preemption timer
This patch contains the following two changes:
1. Fix a bug in nested preemption timer support: if a vmexit from L2
to L0 occurs for a reason not emulated by L1, the preemption timer
value should be saved on such exits.
2. Add support for the "Save VMX-preemption timer value" VM-Exit
control to nVMX.

With this patch, nested VMX preemption timer features are fully
supported.

Signed-off-by: Arthur Chunqi Li <yzt356@gmail.com>
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-10-10 18:22:54 +02:00
Paolo Bonzini
8a3c1a3347 KVM: mmu: change useless int return types to void
kvm_mmu initialization is mostly filling in function pointers; there is
no way for it to fail.  Clean up the unused return values.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 15:44:02 +03:00
Paolo Bonzini
95f93af4ad KVM: mmu: unify destroy_kvm_mmu with kvm_mmu_unload
They do the same thing, and destroy_kvm_mmu can be confused with
kvm_mmu_destroy.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 15:44:01 +03:00
Paolo Bonzini
d8d173dab2 KVM: mmu: remove uninteresting MMU "new_cr3" callbacks
The new_cr3 MMU callback has been a wrapper for mmu_free_roots since commit
e676505 (KVM: MMU: Force cr3 reload with two dimensional paging on mov
cr3 emulation, 2012-07-08).

The commit message mentioned that "mmu_free_roots() is somewhat of an overkill,
but fixing that is more complicated and will be done after this minimal fix".
One year has passed, and no one really felt the need to do a different fix.
Wrap the call with a kvm_mmu_new_cr3 function for clarity, but remove the
callback.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 15:43:59 +03:00
Paolo Bonzini
206260941f KVM: mmu: remove uninteresting MMU "free" callbacks
The free MMU callback has been a wrapper for mmu_free_roots since mmu_free_roots
itself was introduced (commit 17ac10a, [PATCH] KVM: MU: Special treatment
for shadow pae root pages, 2007-01-05), and has always been the same for all
MMU cases.  Remove the indirection as it is useless.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 15:43:56 +03:00
Paolo Bonzini
4344ee981e KVM: x86: only copy XSAVE state for the supported features
This makes the interface more deterministic for userspace, which can expect
(after configuring only the features it supports) to get exactly the same
state from the kernel, independent of the host CPU and kernel version.

Suggested-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 12:29:09 +03:00
Paolo Bonzini
d7876f1be4 KVM: x86: prevent setting unsupported XSAVE states
A guest can still attempt to save and restore XSAVE states even if they
have been masked in CPUID leaf 0Dh.  This usually is not visible to
the guest, but is still wrong: "Any attempt to set a reserved bit (as
determined by the contents of EAX and EDX after executing CPUID with
EAX=0DH, ECX= 0H) in XCR0 for a given processor will result in a #GP
exception".

The patch also performs the same checks as __kvm_set_xcr in KVM_SET_XSAVE.
This catches migration from newer to older kernel/processor before the
guest starts running.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 12:29:07 +03:00
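
A simplified sketch of the check (names follow the 3.12-era x86 code;
guest_supported_xcr0 is derived from the guest's CPUID leaf 0Dh); the same
validation now runs for guest XSETBV and for KVM_SET_XSAVE from userspace.

    static int validate_xcr0(struct kvm_vcpu *vcpu, u64 xcr0)
    {
            if (!(xcr0 & XSTATE_FP))
                    return 1;               /* x87 state is mandatory */
            if ((xcr0 & XSTATE_YMM) && !(xcr0 & XSTATE_SSE))
                    return 1;               /* AVX requires SSE       */
            if (xcr0 & ~vcpu->arch.guest_supported_xcr0)
                    return 1;               /* reserved bit -> #GP    */
            return 0;
    }
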
Paolo Bonzini
647e23bb33 KVM: x86: mask unsupported XSAVE entries from leaf 0Dh index 0
XSAVE entries that KVM does not support are reported by
KVM_GET_SUPPORTED_CPUID for leaf 0Dh index 0 if the host supports them;
they should be left out unless there is also hypervisor support for them.

Sub-leafs are already handled correctly in supported_xcr0_bit; fix
index 0 to match.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 12:29:04 +03:00
Andre Richter
29242cb5c6 virt/kvm/iommu.c: Add leading zeros to device's BDF notation in debug messages
When KVM (de)assigns PCI(e) devices to VMs, a debug message is printed
including the BDF notation of the respective device. Currently, the BDF
notation does not have the commonly used leading zeros. This produces
messages like "assign device 0:1:8.0", which look strange at first sight.

The patch fixes this by exchanging the printk(KERN_DEBUG ...) for
dev_info() and also inserts "kvm" into the message, so that it is
obvious where it comes from. This also reduces the line count.

Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Andre Richter <andre.o.richter@gmail.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
2013-10-03 11:47:02 +03:00
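
A sketch of the change (message text illustrative, not the exact patch):
dev_info() prints the zero-padded domain:bus:device.function prefix by
itself, so the hand-rolled BDF formatting goes away.

    /* Before: printk(KERN_DEBUG "assign device %x:%x:%x.%x\n", ...)
     * printed "assign device 0:1:8.0".  After: */
    dev_info(&dev->dev, "kvm assign device\n");
    /* -> something like "pci-stub 0000:01:08.0: kvm assign device" */
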
Anup Patel
740edfc0a3 KVM: Add documentation for KVM_ARM_PREFERRED_TARGET ioctl
To implement CPU=host, we have added the KVM_ARM_PREFERRED_TARGET
vm ioctl, which provides the information user space needs to create
a VCPU matching the underlying host.

This patch adds information about this new KVM_ARM_PREFERRED_TARGET
vm ioctl to the KVM API documentation.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:49 -07:00
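
A sketch of the documented userspace flow (error handling elided; both
ioctls are as named in the API documentation):

    struct kvm_vcpu_init init;

    /* Ask the VM which target best matches the host... */
    ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init);
    /* ...and initialize each vcpu to exactly that target. */
    ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
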
Anup Patel
42c4e0c77a ARM/ARM64: KVM: Implement KVM_ARM_PREFERRED_TARGET ioctl
To implement CPU=host, we need a mechanism for querying the preferred
VCPU target type on the underlying host.

This patch implements the KVM_ARM_PREFERRED_TARGET vm ioctl, which
returns a struct kvm_vcpu_init instance describing the preferred VCPU
target type and the target-specific features available for it.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:48 -07:00
Anup Patel
473bdc0e65 ARM64: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements the kvm_vcpu_preferred_target() function for
KVM ARM64, which will help us implement the KVM_ARM_PREFERRED_TARGET
ioctl for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:40 -07:00
Anup Patel
4a6fee805d ARM: KVM: Implement kvm_vcpu_preferred_target() function
This patch implements the kvm_vcpu_preferred_target() function for
KVM ARM, which will help us implement the KVM_ARM_PREFERRED_TARGET
ioctl for user space.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 11:29:10 -07:00
Anup Patel
b373e492f3 KVM: ARM: Fix typo in comments of inject_abt()
Fix a very minor typo in the comments of inject_abt(), where we update
the fault status register when injecting a prefetch abort.

Signed-off-by: Anup Patel <anup.patel@linaro.org>
Signed-off-by: Pranavkumar Sawargaonkar <pranavkumar@linaro.org>
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
2013-10-02 17:29:19 +01:00
Paolo Bonzini
2f303b74a6 KVM: Convert kvm_lock back to non-raw spinlock
In commit e935b8372cf8 ("KVM: Convert kvm_lock to raw_spinlock"),
the kvm_lock was made a raw lock.  However, the kvm mmu_shrink()
function tries to grab the (non-raw) mmu_lock within the scope of
the raw locked kvm_lock being held.  This leads to the following:

BUG: sleeping function called from invalid context at kernel/rtmutex.c:659
in_atomic(): 1, irqs_disabled(): 0, pid: 55, name: kswapd0
Preemption disabled at:[<ffffffffa0376eac>] mmu_shrink+0x5c/0x1b0 [kvm]

Pid: 55, comm: kswapd0 Not tainted 3.4.34_preempt-rt
Call Trace:
 [<ffffffff8106f2ad>] __might_sleep+0xfd/0x160
 [<ffffffff817d8d64>] rt_spin_lock+0x24/0x50
 [<ffffffffa0376f3c>] mmu_shrink+0xec/0x1b0 [kvm]
 [<ffffffff8111455d>] shrink_slab+0x17d/0x3a0
 [<ffffffff81151f00>] ? mem_cgroup_iter+0x130/0x260
 [<ffffffff8111824a>] balance_pgdat+0x54a/0x730
 [<ffffffff8111fe47>] ? set_pgdat_percpu_threshold+0xa7/0xd0
 [<ffffffff811185bf>] kswapd+0x18f/0x490
 [<ffffffff81070961>] ? get_parent_ip+0x11/0x50
 [<ffffffff81061970>] ? __init_waitqueue_head+0x50/0x50
 [<ffffffff81118430>] ? balance_pgdat+0x730/0x730
 [<ffffffff81060d2b>] kthread+0xdb/0xe0
 [<ffffffff8106e122>] ? finish_task_switch+0x52/0x100
 [<ffffffff817e1e94>] kernel_thread_helper+0x4/0x10
 [<ffffffff81060c50>] ? __init_kthread_worker+0x

After the previous patch, kvm_lock need not be a raw spinlock anymore,
so change it back.

Reported-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:21:51 +02:00
Paolo Bonzini
4a937f96f3 KVM: protect kvm_usage_count with its own spinlock
The VM list need not be protected by a raw spinlock.  Separate the
two so that kvm_lock can be made non-raw.

Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:21:46 +02:00
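
Taken together with the entry above, the net effect is roughly the
following split (the count lock's name is an assumption):

    static DEFINE_SPINLOCK(kvm_lock);            /* vm_list: non-raw again */
    static DEFINE_RAW_SPINLOCK(kvm_count_lock);  /* kvm_usage_count only   */
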
Paolo Bonzini
4fa92fb25a KVM: cleanup (physical) CPU hotplug
Remove the useless argument, and do not do anything if there are no
VMs running at the time of the hotplug.

Cc: kvm@vger.kernel.org
Cc: gleb@redhat.com
Cc: jan.kiszka@siemens.com
Reviewed-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:21:30 +02:00
Gleb Natapov
feaf0c7dc4 KVM: nVMX: Do not generate #DF if #PF happens during exception delivery into L2
If a #PF happens during delivery of an exception into L2 and L1 also
does not have the page mapped in its shadow page table, then L0 needs to
generate a vmexit to L1 with the original event in IDT_VECTORING_INFO,
but the current code combines both exceptions and generates a #DF instead.
Fix that by providing an nVMX-specific function to handle page faults
during the page table walk that handles this case correctly.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:14:25 +02:00
Gleb Natapov
e011c663b9 KVM: nVMX: Check all exceptions for intercept during delivery to L2
All exceptions should be checked for intercept during delivery to L2,
but we check only #PF currently. Drop the nested_run_pending check while
we are at it, since an exception cannot be injected during vmentry anyway.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
[Renamed the nested_vmx_check_exception function. - Paolo]
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:14:24 +02:00
Gleb Natapov
851eb6677c KVM: nVMX: Do not put exception that caused vmexit to IDT_VECTORING_INFO
If an exception causes a vmexit directly, it should not be reported in
IDT_VECTORING_INFO during the exit. For that we need to be able to
distinguish between an exception that is injected into the nested VM and
one that is reinjected because its delivery failed. Fortunately we already
have a mechanism to do so for nested SVM, so here we just use the correct
function to requeue exceptions and make sure that a reinjected exception is
not moved to IDT_VECTORING_INFO during vmexit emulation and not re-checked
for interception during delivery.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:14:24 +02:00
Gleb Natapov
e0b890d35c KVM: nVMX: Amend nested_run_pending logic
An EXIT_REASON_VMLAUNCH/EXIT_REASON_VMRESUME exit does not mean that the
nested VM will actually run during the next entry. Move the setting of
nested_run_pending closer to the vmentry emulation code and move its
clearing close to vmexit, to minimize the amount of code that will
erroneously run with nested_run_pending set.

Signed-off-by: Gleb Natapov <gleb@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-30 09:14:23 +02:00
Thomas Huth
6a3f95a6b0 KVM: s390: Intercept SCK instruction
Interception of the SET CLOCK instruction is mandatory, so this patch
provides a simple handler for this instruction (by setting up the
"epoch" field in the sie_block).

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:22 +02:00
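
A sketch of the handler's core (simplified; the variable name is assumed):
the SIE hardware adds "epoch" to the host TOD clock for the guest, so
SET CLOCK reduces to computing a delta.

    /* guest_tod: the SET CLOCK operand, fetched from guest memory */
    vcpu->arch.sie_block->epoch = guest_tod - get_tod_clock();
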
Thomas Huth
aca84241b5 KVM: s390: Implement TEST BLOCK
This patch provides a simple version for the mandatory TEST BLOCK
instruction interception, so that guests that use this instruction
do not crash anymore.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:21 +02:00
Thomas Huth
732e563373 KVM: s390: Helper for converting real addresses to absolute
Added a separate helper function that translates guest real addresses
to guest absolute addresses by applying the prefix of the guest CPU.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:20 +02:00
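
A sketch of the helper (simplified; 4K pages assumed): s390 prefixing
swaps the first two pages of real storage with the two pages at the CPU's
prefix, so the conversion just undoes that swap.

    static unsigned long real_to_abs(unsigned long prefix, unsigned long raddr)
    {
            if (raddr < 2 * PAGE_SIZE)
                    return raddr + prefix;  /* low pages live at the prefix */
            if (raddr >= prefix && raddr < prefix + 2 * PAGE_SIZE)
                    return raddr - prefix;  /* prefix pages live at 0       */
            return raddr;                   /* everything else is identity  */
    }
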
Thomas Huth
ff7158b227 KVM: s390: Allow NULL parameter for kvm_s390_get_regs_rre
We're not always interested in both registers that are specified
for an RRE instruction. So allow NULL as a parameter, too, to indicate
that we do not need the corresponding value.

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:20 +02:00
Thomas Huth
800c1065c3 KVM: s390: Lock kvm->srcu at the appropriate places
The kvm->srcu lock has to be held while accessing the memory of
guests and during certain other actions. This patch adds
the locks to the __vcpu_run function so that all affected code
is now protected (and additionally to the KVM_S390_STORE_STATUS
ioctl, which can be called out-of-band and needs a separate lock).

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:19 +02:00
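
The locking pattern being added around the run loop, sketched with the
generic srcu API (the called function is a stand-in):

    int idx = srcu_read_lock(&vcpu->kvm->srcu);

    rc = do_vcpu_run(vcpu);     /* hypothetical body: touches guest memory */

    srcu_read_unlock(&vcpu->kvm->srcu, idx);
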
Thomas Huth
a76ccff6f5 KVM: s390: Push run loop into __vcpu_run
Moved the do-while loop from kvm_arch_vcpu_ioctl_run into __vcpu_run
and the call to kvm_handle_sie_intercept() into vcpu_post_run()
(so we can add the srcu locks in a proper way in the next patch).

Signed-off-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-09-24 19:12:18 +02:00