linux

iv/linux

Author	SHA1	Message	Date
Martin Schwidefsky	affbb42023	[S390] Fix compile error in swab.h The inline assembly in__arch_swab16p causes compile errors of the form: error: invalid 'asm': operand number missing after %-letter The assembly uses the %O<n>/%R<n> notation but the first operand misses the operand number, it needs to be "%O1" instead of "%O". Reported-by: Gil Peleg <gilpeleg@servframe.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:26 +02:00
Michael Holzheu	37e37c20ab	[S390] Fix stfle() lowcore protection problem The stfle() function writes into lowcore memory when stfl_fac_list is initialized with "S390_lowcore.stfl_fac_list = 0". For older compilers this triggers a lowcore exception. With newer compilers and "-OXX" compile option the bug does not show up because the "S390_lowcore.stfl_fac_list" initialization is removed by the compiler. The reason for thatis the incorrect "=m" (S390_lowcore.stfl_fac_list) constraint in the stfl inline assembly. The following shows the disassembly of the stfle() optimized code that is inlined in the lgr_info_get() function: 000000000011325c <lgr_info_get>: 11325c: eb 9f f0 60 00 24 stmg %r9,%r15,96(%r15) 113262: c0 d0 00 29 0e 47 larl %r13,634ef0 <servi..> 113268: a7 f1 3f c0 tml %r15,16320 11326c: b9 04 00 ef lgr %r14,%r15 113270: a7 84 00 01 je 113272 <lgr_info_g..> 113274: a7 fb ff c0 aghi %r15,-64 113278: b9 04 00 c2 lgr %r12,%r2 11327c: a7 29 00 01 lghi %r2,1 113280: e3 e0 f0 98 00 24 stg %r14,152(%r15) 113286: d7 97 c0 00 c0 00 xc 0(152,%r12),0(%r12) 11328c: c0 e5 00 28 db 4c brasl %r14,62e924 <add_e..> 113292: b2 b1 00 00 stfl 0 To fix the problem we now clear the S390_lowcore.stfl_fac_list at startup in "head.S" for all machine types before lowcore protection is enabled. In addition to that the "=m" constraint is replaced by "+m". Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:26 +02:00
Heiko Carstens	af0ee94e54	[S390] cpum_cf: get rid of compile warnings Fix these: arch/s390/kernel/perf_cpum_cf.c:180:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'int' [-Wformat] arch/s390/kernel/perf_cpum_cf.c: In function 'cpumf_pmu_disable': arch/s390/kernel/perf_cpum_cf.c:205:3: warning: format '%lx' expects argument of type 'long unsigned int', but argument 2 has type 'int' [-Wformat] Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:25 +02:00
Heiko Carstens	7968ca8148	[S390] irq: simple coding style change Use braces for if/else/list_for_each_entry bodies if the body consists of more than a single line. Otherwise I get confused and check if there is something broken whenever I see these code snippets. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:25 +02:00
Michael Holzheu	ac2ac6e852	[S390] update default configuration Add TASKSTATS options, enable CRASH_DUMP, and regenerate defconfig file with "make savedefconfig". Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:25 +02:00
Martin Schwidefsky	cd94154cc6	[S390] fix tlb flushing for page table pages Git commit 36409f6353fc2d7b6516e631415f938eadd92ffa "use generic RCU page-table freeing code" introduced a tlb flushing bug. Partially revert the above git commit and go back to s390 specific page table flush code. For s390 the TLB can contain three types of entries, "normal" TLB page-table entries, TLB combined region-and-segment-table (CRST) entries and real-space entries. Linux does not use real-space entries which leaves normal TLB entries and CRST entries. The CRST entries are intermediate steps in the page-table translation called translation paths. For example a 4K page access in a three-level page table setup will create two CRST TLB entries and one page-table TLB entry. The advantage of that approach is that a page access next to the previous one can reuse the CRST entries and needs just a single read from memory to create the page-table TLB entry. The disadvantage is that the TLB flushing rules are more complicated, before any page-table may be freed the TLB needs to be flushed. In short: the generic RCU page-table freeing code is incorrect for the CRST entries, in particular the check for mm_users < 2 is troublesome. This is applicable to 3.0+ kernels. Cc: <stable@vger.kernel.org> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:24 +02:00
Michael Holzheu	b785e0d06a	[S390] kernel: Use local_irq_save() for memcpy_real() Currently in the memcpy_real() function interrupts are disabled with __arch_local_irq_stnsm(). In order to notify lockdep that interrupts are disabled, with this patch local_irq_save() is used instead. The function __arch_local_irq_stnsm() is still used for switching to real mode. Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com> Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>	2012-04-11 14:28:24 +02:00
Grant Likely	a699e4e49e	irq: Kill pointless irqd_to_hw export It makes no sense to export this trivial function. Make it a static inline instead. This patch also drops virq_to_hw from arch/c6x since it is unused by that architecture. v2: Move irq_hw_number_t into types.h to fix ARM build failure Signed-off-by: Grant Likely <grant.likely@secretlab.ca> Acked-by: Thomas Gleixner <tglx@linutronix.de> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-04-10 22:39:17 -06:00
Benjamin Herrenschmidt	fae2e0fb24	powerpc: Fix typo in runlatch code Commit fe1952fc0afb9a2e4c79f103c08aef5d13db1873 "powerpc: Rework runlatch code" has a nasty typo where it uses "TLF_RUNLATCH" instead of "_TLF_RUNLATCH" (bit number instead of bit mask), causing some flags to be potentially lost such as _TLF_RESTORE_SIGMASK (Brown paper bag for me ! We should be able to make that break at compile time with a bit of magic, any volunteer ?) Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-04-11 10:42:15 +10:00
Linus Torvalds	a9e1e53bcf	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: 1) Build fix for LEON, from Sam Ravnborg. 2) Make the sparc side changes that go along with the infrastructure to retry faults when blocking on a disk transfer. From Kautuk Consul. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc32,leon: fix leon build sparc/mm/fault_32.c: Port OOM changes to do_sparc_fault sparc/mm/fault_64.c: Port OOM changes to do_sparc64_fault	2012-04-10 10:28:26 -07:00
Thomas Abraham	60806225cd	ARM: EXYNOS: Add PDMA and MDMA physical base address defines Add PDMA and MDMA physical base address macros which is require for EXYNOS5 of_dev_auxdata setup. Signed-off-by: Thomas Abraham <thomas.ab@samsung.com> [kgene.kim@samsung.com: changed dma channel fo mdma1] Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2012-04-10 08:41:35 -07:00
Sachin Kamat	6e7beb7ec8	ARM: S5PV210: Fix compiler warning in dma.c file Fixes the following warning: warning: 'dma_dmamask' defined but not used [-Wunused-variable] Signed-off-by: Sachin Kamat <sachin.kamat@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2012-04-10 08:40:19 -07:00
Gleb Natapov	f19a0c2c2e	KVM: PMU emulation: GLOBAL_CTRL MSR should be enabled on reset On reset all MPU counters should be enabled in GLOBAL_CTRL MSR. Signed-off-by: Gleb Natapov <gleb@redhat.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-10 15:34:10 +03:00
Will Deacon	34af657916	ARM: 7377/1: vic: re-read status register before dispatching each IRQ handler handle_IRQ may briefly cause interrupts to be re-enabled during soft IRQ processing on the exit path, leading to nested handling of VIC interrupts. Since the current code does not re-read the VIC_IRQ_STATUS register, this can lead to multiple invocations of the same interrupt handler and spurious interrupts to be reported. This patch changes the VIC interrupt dispatching code to re-read the status register each time, avoiding duplicate invocations of the same handler. Acked-and-Tested-by: H Hartley Sweeten <hsweeten@visionengravers.com> Reviewed-by: Jamie Iles <jamie@jamieiles.com> Signed-off-by: Will Deacon <will.deacon@arm.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-04-10 09:27:42 +01:00
Kautuk Consul	dff2aa7af8	ARM: 7368/1: fault.c: correct how the tsk->[maj\|min]_flt gets incremented commit 8878a539ff19a43cf3729e7562cd528f490246ae was done by me to make the page fault handler retryable as well as interruptible. Due to this commit, there is a mistake in the way in which tsk->[maj\|min]_flt counter gets incremented for VM_FAULT_ERROR: If VM_FAULT_ERROR is returned in the fault flags by handle_mm_fault, then either maj_flt or min_flt will get incremented. This is wrong as in the case of a VM_FAULT_ERROR we need to be skip ahead to the error handling code in do_page_fault. Added a check after the call to __do_page_fault() to check for (fault & VM_FAULT_ERROR). Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>	2012-04-10 09:27:42 +01:00
Benjamin Herrenschmidt	08f1ec8a59	powerpc: Fix page fault with lockdep regression commit a546498f3bf9aac311c66f965186373aee2ca0b0 introduced a regression on 32-bit when irq tracing is enabled by exposing an old bug in our irq tracing code for exception entry. The code would save and restore some GPRs around the calls to the C lockdep code, however, it tries to be too smart for its own good and restores some of the GPRs from the exception frame (as saved there on exception entry). However, for page faults, we do replace those GPRs with arguments to do_page_fault before we call transfer_to_handler and so restoring from the exception frame is plain wrong in this case. This was fine as long as we didn't touch the interrupt state when taking page fault, but when I started doing it, it would trigger the lockdep calls and the bug. This fixes it by cleaning up that code a bit. It did create a small stack frame for the sake of backtraces, so let's make it a bit bigger and use it to save and restore the stuff we care about. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2012-04-10 17:21:35 +10:00
Boaz Harrosh	657b12d3a1	um: uml_setup_stubs': warning: unused variable 'pages' Fix the following gcc complain arch/um/kernel/skas/mmu.c: In function 'uml_setup_stubs': arch/um/kernel/skas/mmu.c:106:16: warning: unused variable 'pages' [-Wunused-variable] Signed-Signed-off-by: Boaz Harrosh <bharrosh@panasas.com> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-04-10 00:13:45 +02:00
Richard Weinberger	76b278edd9	um: Use asm-generic/switch_to.h Signed-off-by: Richard Weinberger <richard@nod.at>	2012-04-10 00:13:45 +02:00
Richard Weinberger	a3a85a763c	um: Disintegrate asm/system.h Signed-off-by: Richard Weinberger <richard@nod.at> Reported-by: Toralf Förster <toralf.foerster@gmx.de> CC: dhowells@redhat.com	2012-04-10 00:13:45 +02:00
Al Viro	d824d06328	um: switch cow_user.h to htobe{32,64}/betoh{32,64} ... rather than open-coding the 64bit versions. endian.h has those guys. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Richard Weinberger <richard@nod.at>	2012-04-10 00:13:45 +02:00
Srivatsa S. Bhat	d1640130cd	tile/CPU hotplug: Add missing call to notify_cpu_starting() The scheduler depends on receiving the CPU_STARTING notification, without which we end up into a lot of trouble. So add the missing call to notify_cpu_starting() in the bringup code. Signed-off-by: Srivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com> Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>	2012-04-09 14:02:23 -04:00
Al Viro	3cb42092ff	um: fix linker script generation while we can't just use -U$(SUBARCH), we still need to kill idiotic define (implicit -Di386=1), both for SUBARCH=i386 and SUBARCH=x86/CONFIG_64BIT=n builds. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2012-04-09 13:59:00 -04:00
Jonghwan Choi	32db797f10	ARM: EXYNOS: Fix compile error in exynos5250-cpufreq.c This patch is omitted in v2 patch of Jaecheol Lee. drivers/cpufreq/exynos5250-cpufreq.c: In function 'set_clkdiv': drivers/cpufreq/exynos5250-cpufreq.c:144: error: 'EXYNOS5_CLKDIV_STATCPU0' undeclared (first use in this function) drivers/cpufreq/exynos5250-cpufreq.c:144: error: (Each undeclared identifier is reported only once drivers/cpufreq/exynos5250-cpufreq.c:144: error: for each function it appears in.) drivers/cpufreq/exynos5250-cpufreq.c:150: error: 'EXYNOS5_CLKDIV_CPU1' undeclared (first use in this function) drivers/cpufreq/exynos5250-cpufreq.c:152: error: 'EXYNOS5_CLKDIV_STATCPU1' undeclared (first use in this function) drivers/cpufreq/exynos5250-cpufreq.c: In function 'set_apll': drivers/cpufreq/exynos5250-cpufreq.c:166: error: 'EXYNOS5_CLKMUX_STATCPU' undeclared (first use in this function) drivers/cpufreq/exynos5250-cpufreq.c:173: error: 'EXYNOS5_APLL_LOCK' undeclared (first use in this function) drivers/cpufreq/exynos5250-cpufreq.c: In function 'exynos5250_cpufreq_init': drivers/cpufreq/exynos5250-cpufreq.c:312: error: 'EXYNOS5_CLKDIV_CPU1' undeclared (first use in this function) Cc: Jaecheol Lee <jc.lee@samsung.com> Signed-off-by: Jonghwan Choi <jhbird.choi@samsung.com> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2012-04-08 17:14:53 -07:00
Tushar Behera	0d923490f7	ARM: EXYNOS: Add missing definition for IRQ_I2S0 This fixes following build error when sound support is selected on EXYNOS4 platform. sound/soc/samsung/idma.c: In function ‘idma_close’: sound/soc/samsung/idma.c:327:11: error: ‘IRQ_I2S0’ undeclared (first use in this function) Signed-off-by: Tushar Behera <tushar.behera@linaro.org> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2012-04-08 17:14:52 -07:00
Kukjin Kim	7049932920	ARM: S5PV210: fix unused LDO supply field from wm8994_pdata According to commit 719a4240("mfd: Remove unused LDO supply field from WM8994 pdata"), the LDO supply field should be removed from the initializer. Cc: Mark Brown <broonie@opensource.wolfsonmicro.com> Signed-off-by: Kukjin Kim <kgene.kim@samsung.com>	2012-04-08 17:14:52 -07:00
Takuya Yoshikawa	1e3f42f03c	KVM: MMU: Improve iteration through sptes from rmap Iteration using rmap_next(), the actual body is pte_list_next(), is inefficient: every time we call it we start from checking whether rmap holds a single spte or points to a descriptor which links more sptes. In the case of shadow paging, this quadratic total iteration cost is a problem. Even for two dimensional paging, with EPT/NPT on, in which we almost always have a single mapping, the extra checks at the end of the iteration should be eliminated. This patch fixes this by introducing rmap_iterator which keeps the iteration context for the next search. Furthermore the implementation of rmap_next() is splitted into two functions, rmap_get_first() and rmap_get_next(), to avoid repeatedly checking whether the rmap being iterated on has only one spte. Although there seemed to be only a slight change for EPT/NPT, the actual improvement was significant: we observed that GET_DIRTY_LOG for 1GB dirty memory became 15% faster than before. This is probably because the new code is easy to make branch predictions. Note: we just remove pte_list_next() because we can think of parent_ptes as a reverse mapping. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 16:08:27 +03:00
Takuya Yoshikawa	220f773a00	KVM: MMU: Make pte_list_desc fit cache lines well We have PTE_LIST_EXT + 1 pointers in this structure and these 40/20 bytes do not fit cache lines well. Furthermore, some allocators may use 64/32-byte objects for the pte_list_desc cache. This patch solves this problem by changing PTE_LIST_EXT from 4 to 3. For shadow paging, the new size is still large enough to hold both the kernel and process mappings for usual anonymous pages. For file mappings, there may be a slight change in the cache usage. Note: with EPT/NPT we almost always have a single spte in each reverse mapping and we will not see any change by this. Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 16:08:25 +03:00
Davidlohr Bueso	c36fc04ef5	KVM: x86: add paging gcc optimization Since most guests will have paging enabled for memory management, add likely() optimization around CR0.PG checks. Signed-off-by: Davidlohr Bueso <dave@gnu.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:03:13 +03:00
Josh Triplett	e9bda3b3d0	KVM: VMX: Auto-load on CPUs with VMX Enable x86 feature-based autoloading for the kvm-intel module on CPUs with X86_FEATURE_VMX. Signed-off-by: Josh Triplett <josh@joshtriplett.org> Acked-By: Kay Sievers <kay@vrfy.org> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:03:12 +03:00
Benjamin Herrenschmidt	bbcc9c0669	powerpc/kvm: Fix magic page vs. 32-bit RTAS on ppc64 When the kernel calls into RTAS, it switches to 32-bit mode. The magic page was is longer accessible in that case, causing the patched instructions in the RTAS call wrapper to crash. This fixes it by making available a 32-bit mapping of the magic page in that case. This mapping is flushed whenever we switch the kernel back to 64-bit mode. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [agraf: add a check if the magic page is mapped] Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:02:39 +03:00
Alexander Graf	966cd0f3bd	KVM: PPC: Ignore unhalt request from kvm_vcpu_block When running kvm_vcpu_block and it realizes that the CPU is actually good to run, we get a request bit set for KVM_REQ_UNHALT. Right now, there's nothing we can do with that bit, so let's unset it right after the call again so we don't get confused in our later checks for pending work. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:02:38 +03:00
Alexander Graf	4f225ae06e	KVM: PPC: Book3s: PR: Add HV traps so we can run in HV=1 mode on p7 When running PR KVM on a p7 system in bare metal, we get HV exits instead of normal supervisor traps. Semantically they are identical though and the HSRR vs SRR difference is already taken care of in the exit code. So all we need to do is handle them in addition to our normal exits. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:02:00 +03:00
Alexander Graf	6df79df5b2	KVM: PPC: Emulate tw and td instructions There are 4 conditional trapping instructions: tw, twi, td, tdi. The ones with an i take an immediate comparison, the others compare two registers. All of them arrive in the emulator when the condition to trap was successfully fulfilled. Unfortunately, we were only implementing the i versions so far, so let's also add support for the other two. This fixes kernel booting with recents book3s_32 guest kernels. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:57 +03:00
Alexander Graf	6020c0f6e7	KVM: PPC: Pass EA to updating emulation ops When emulating updating load/store instructions (lwzu, stwu, ...) we need to write the effective address of the load/store into a register. Currently, we write the physical address in there, which is very wrong. So instead let's save off where the virtual fault was on MMIO and use that information as value to put into the register. While at it, also move the XOP variants of the above instructions to the new scheme of using the already known vaddr instead of calculating it themselves. Reported-by: Jörg Sommer <joerg@alea.gnuu.de> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:37 +03:00
Paul Mackerras	8943633cf9	KVM: PPC: Work around POWER7 DABR corruption problem It turns out that on POWER7, writing to the DABR can cause a corrupted value to be written if the PMU is active and updating SDAR in continuous sampling mode. To work around this, we make sure that the PMU is inactive and SDAR updates are disabled (via MMCRA) when we are context-switching DABR. When the guest sets DABR via the H_SET_DABR hypercall, we use a slightly different workaround, which is to read back the DABR and write it again if it got corrupted. While we are at it, make it consistent that the saving and restoring of the guest's non-volatile GPRs and the FPRs are done with the guest setup of the PMU active. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:36 +03:00
Paul Mackerras	7657f4089b	KVM: PPC: Book 3S: Fix compilation for !HV configs Commits 2f5cdd5487 ("KVM: PPC: Book3S HV: Make secondary threads more robust against stray IPIs") and 1c2066b0f7 ("KVM: PPC: Book3S HV: Make virtual processor area registration more robust") added fields to struct kvm_vcpu_arch inside #ifdef CONFIG_KVM_BOOK3S_64_HV regions, and added lines to arch/powerpc/kernel/asm-offsets.c to generate assembler constants for their offsets. Unfortunately this led to compile errors on Book 3S machines for configs that had KVM enabled but not CONFIG_KVM_BOOK3S_64_HV. This fixes the problem by moving the offending lines inside #ifdef CONFIG_KVM_BOOK3S_64_HV regions. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:34 +03:00
Bharat Bhushan	c0fe7b0999	Restore guest CR after exit timing calculation No instruction which can change Condition Register (CR) should be executed after Guest CR is loaded. So the guest CR is restored after the Exit Timing in lightweight_exit executes cmpw, which can clobber CR. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:31 +03:00
Paul Mackerras	0456ec4ff2	KVM: PPC: Book3S HV: Report stolen time to guest through dispatch trace log This adds code to measure "stolen" time per virtual core in units of timebase ticks, and to report the stolen time to the guest using the dispatch trace log (DTL). The guest can register an area of memory for the DTL for a given vcpu. The DTL is a ring buffer where KVM fills in one entry every time it enters the guest for that vcpu. Stolen time is measured as time when the virtual core is not running, either because the vcore is not runnable (e.g. some of its vcpus are executing elsewhere in the kernel or in userspace), or when the vcpu thread that is running the vcore is preempted. This includes time when all the vcpus are idle (i.e. have executed the H_CEDE hypercall), which is OK because the guest accounts stolen time while idle as idle time. Each vcpu keeps a record of how much stolen time has been reported to the guest for that vcpu so far. When we are about to enter the guest, we create a new DTL entry (if the guest vcpu has a DTL) and report the difference between total stolen time for the vcore and stolen time reported so far for the vcpu as the "enqueue to dispatch" time in the DTL entry. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:29 +03:00
Paul Mackerras	2e25aa5f64	KVM: PPC: Book3S HV: Make virtual processor area registration more robust The PAPR API allows three sorts of per-virtual-processor areas to be registered (VPA, SLB shadow buffer, and dispatch trace log), and furthermore, these can be registered and unregistered for another virtual CPU. Currently we just update the vcpu fields pointing to these areas at the time of registration or unregistration. If this is done on another vcpu, there is the possibility that the target vcpu is using those fields at the time and could end up using a bogus pointer and corrupting memory. This fixes the race by making the target cpu itself do the update, so we can be sure that the update happens at a time when the fields aren't being used. Each area now has a struct kvmppc_vpa which is used to manage these updates. There is also a spinlock which protects access to all of the kvmppc_vpa structs, other than to the pinned_addr fields. (We could have just taken the spinlock when using the vpa, slb_shadow or dtl fields, but that would mean taking the spinlock on every guest entry and exit.) This also changes 'struct dtl' (which was undefined) to 'struct dtl_entry', which is what the rest of the kernel uses. Thanks to Michael Ellerman <michael@ellerman.id.au> for pointing out the need to initialize vcpu->arch.vpa_update_lock. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:27 +03:00
Paul Mackerras	f0888f7015	KVM: PPC: Book3S HV: Make secondary threads more robust against stray IPIs Currently on POWER7, if we are running the guest on a core and we don't need all the hardware threads, we do nothing to ensure that the unused threads aren't executing in the kernel (other than checking that they are offline). We just assume they're napping and we don't do anything to stop them trying to enter the kernel while the guest is running. This means that a stray IPI can wake up the hardware thread and it will then try to enter the kernel, but since the core is in guest context, it will execute code from the guest in hypervisor mode once it turns the MMU on, which tends to lead to crashes or hangs in the host. This fixes the problem by adding two new one-byte flags in the kvmppc_host_state structure in the PACA which are used to interlock between the primary thread and the unused secondary threads when entering the guest. With these flags, the primary thread can ensure that the unused secondaries are not already in kernel mode (i.e. handling a stray IPI) and then indicate that they should not try to enter the kernel if they do get woken for any reason. Instead they will go into KVM code, find that there is no vcpu to run, acknowledge and clear the IPI and go back to nap mode. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:20 +03:00
Alexander Graf	f6127716c3	KVM: PPC: Save/Restore CR over vcpu_run On PPC, CR2-CR4 are nonvolatile, thus have to be saved across function calls. We didn't respect that for any architecture until Paul spotted it in his patch for Book3S-HV. This patch saves/restores CR for all KVM capable PPC hosts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 14:01:02 +03:00
Matt Evans	3aaefef200	KVM: PPC: Book3s: PR: Add SPAPR H_BULK_REMOVE support SPAPR support includes various in-kernel hypercalls, improving performance by cutting out the exit to userspace. H_BULK_REMOVE is implemented in this patch. Signed-off-by: Matt Evans <matt@ozlabs.org> Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:31 +03:00
Alexander Graf	03660ba270	KVM: PPC: Booke: only prepare to enter when we enter So far, we've always called prepare_to_enter even when all we did was return to the host. This patch changes that semantic to only call prepare_to_enter when we actually want to get back into the guest. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:29 +03:00
Alexander Graf	7cc1e8ee78	KVM: PPC: booke: Reinject performance monitor interrupts When we get a performance monitor interrupt, we need to make sure that the host receives it. So reinject it like we reinject the other host destined interrupts. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:28 +03:00
Alexander Graf	4e642ccbd6	KVM: PPC: booke: expose good state on irq reinject When reinjecting an interrupt into the host interrupt handler after we're back in host kernel land, we need to tell the kernel where the interrupt happened. We can't tell it that we were in guest state, because that might lead to random code walking host addresses. So instead, we tell it that we came from the interrupt reinject code. This helps getting reasonable numbers out of perf. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:26 +03:00
Alexander Graf	95f2e92144	KVM: PPC: booke: Support perfmon interrupts When during guest context we get a performance monitor interrupt, we currently bail out and oops. Let's route it to its correct handler instead. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:24 +03:00
Alexander Graf	c6b3733bef	KVM: PPC: e500: fix typo in tlb code The tlbncfg registers should be populated with their respective TLB's values. Fix the obvious typo. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:22 +03:00
Alexander Graf	55cdf08b9a	KVM: PPC: bookehv: remove unused code There was some unused code in the exit code path that must have been a leftover from earlier iterations. While it did no harm, it's superfluous and thus should be removed. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:21 +03:00
Alexander Graf	0268597c81	KVM: PPC: booke: add GS documentation for program interrupt The comment for program interrupts triggered when using bookehv was misleading. Update it to mention why MSR_GS indicates that we have to inject an interrupt into the guest again, not emulate it. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:19 +03:00
Alexander Graf	c35c9d84cf	KVM: PPC: booke: Readd debug abort code for machine check When during guest execution we get a machine check interrupt, we don't know how to handle it yet. So let's add the error printing code back again that we dropped accidently earlier and tell user space that something went really wrong. Signed-off-by: Alexander Graf <agraf@suse.de> Signed-off-by: Avi Kivity <avi@redhat.com>	2012-04-08 12:55:17 +03:00

1 2 3 4 5 ...

67649 Commits