linux

iv/linux

Author	SHA1	Message	Date
Bharat Bhushan	b12c784123	KVM: PPC: E500: exit to user space on "ehpriv 1" instruction "ehpriv 1" instruction is used for setting software breakpoints by user space. This patch adds support to exit to user space with "run->debug" have relevant information. As this is the first point we are using run->debug, also defined the run->debug structure. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:39 +02:00
Bharat Bhushan	fc82cf113b	powerpc: export debug registers save function for KVM KVM need this function when switching from vcpu to user-space thread. My subsequent patch will use this function. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Acked-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:39 +02:00
Bharat Bhushan	95791988fe	powerpc: move debug registers in a structure This way we can use same data type struct with KVM and also help in using other debug related function. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:38 +02:00
Bharat Bhushan	c199efa295	powerpc: book3e: _PAGE_LENDIAN must be _PAGE_ENDIAN For booke3e _PAGE_ENDIAN is not defined. Infact what is defined is "_PAGE_LENDIAN" which is wrong and that should be _PAGE_ENDIAN. There are no compilation errors as arch/powerpc/include/asm/pte-common.h defines _PAGE_ENDIAN to 0 as it is not defined anywhere. Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:37 +02:00
Paul Mackerras	44a3add863	KVM: PPC: Book3S HV: Better handling of exceptions that happen in real mode When an interrupt or exception happens in the guest that comes to the host, the CPU goes to hypervisor real mode (MMU off) to handle the exception but doesn't change the MMU context. After saving a few registers, we then clear the "in guest" flag. If, for any reason, we get an exception in the real-mode code, that then gets handled by the normal kernel exception handlers, which turn the MMU on. This is disastrous if the MMU is still set to the guest context, since we end up executing instructions from random places in the guest kernel with hypervisor privilege. In order to catch this situation, we define a new value for the "in guest" flag, KVM_GUEST_MODE_HOST_HV, to indicate that we are in hypervisor real mode with guest MMU context. If the "in guest" flag is set to this value, we branch off to an emergency handler. For the moment, this just does a branch to self to stop the CPU from doing anything further. While we're here, we define another new flag value to indicate that we are in a HV guest, as distinct from a PR guest. This will be useful when we have a kernel that can support both PR and HV guests concurrently. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:37 +02:00
Paul Mackerras	d78bca7296	KVM: PPC: Book3S PR: Use mmu_notifier_retry() in kvmppc_mmu_map_page() When the MM code is invalidating a range of pages, it calls the KVM kvm_mmu_notifier_invalidate_range_start() notifier function, which calls kvm_unmap_hva_range(), which arranges to flush all the existing host HPTEs for guest pages. However, the Linux PTEs for the range being flushed are still valid at that point. We are not supposed to establish any new references to pages in the range until the ...range_end() notifier gets called. The PPC-specific KVM code doesn't get any explicit notification of that; instead, we are supposed to use mmu_notifier_retry() to test whether we are or have been inside a range flush notifier pair while we have been getting a page and instantiating a host HPTE for the page. This therefore adds a call to mmu_notifier_retry inside kvmppc_mmu_map_page(). This call is inside a region locked with kvm->mmu_lock, which is the same lock that is called by the KVM MMU notifier functions, thus ensuring that no new notification can proceed while we are in the locked region. Inside this region we also create the host HPTE and link the corresponding hpte_cache structure into the lists used to find it later. We cannot allocate the hpte_cache structure inside this locked region because that can lead to deadlock, so we allocate it outside the region and free it if we end up not using it. This also moves the updates of vcpu3s->hpte_cache_count inside the regions locked with vcpu3s->mmu_lock, and does the increment in kvmppc_mmu_hpte_cache_map() when the pte is added to the cache rather than when it is allocated, in order that the hpte_cache_count is accurate. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:35 +02:00
Paul Mackerras	93b159b466	KVM: PPC: Book3S PR: Better handling of host-side read-only pages Currently we request write access to all pages that get mapped into the guest, even if the guest is only loading from the page. This reduces the effectiveness of KSM because it means that we unshare every page we access. Also, we always set the changed (C) bit in the guest HPTE if it allows writing, even for a guest load. This fixes both these problems. We pass an 'iswrite' flag to the mmu.xlate() functions and to kvmppc_mmu_map_page() to indicate whether the access is a load or a store. The mmu.xlate() functions now only set C for stores. kvmppc_gfn_to_pfn() now calls gfn_to_pfn_prot() instead of gfn_to_pfn() so that it can indicate whether we need write access to the page, and get back a 'writable' flag to indicate whether the page is writable or not. If that 'writable' flag is clear, we then make the host HPTE read-only even if the guest HPTE allowed writing. This means that we can get a protection fault when the guest writes to a page that it has mapped read-write but which is read-only on the host side (perhaps due to KSM having merged the page). Thus we now call kvmppc_handle_pagefault() for protection faults as well as HPTE not found faults. In kvmppc_handle_pagefault(), if the access was allowed by the guest HPTE and we thus need to install a new host HPTE, we then need to remove the old host HPTE if there is one. This is done with a new function, kvmppc_mmu_unmap_page(), which uses kvmppc_mmu_pte_vflush() to find and remove the old host HPTE. Since the memslot-related functions require the KVM SRCU read lock to be held, this adds srcu_read_lock/unlock pairs around the calls to kvmppc_handle_pagefault(). Finally, this changes kvmppc_mmu_book3s_32_xlate_pte() to not ignore guest HPTEs that don't permit access, and to return -EPERM for accesses that are not permitted by the page protections. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:49:35 +02:00
Paul Mackerras	3ff955024d	KVM: PPC: Book3S PR: Allocate kvm_vcpu structs from kvm_vcpu_cache This makes PR KVM allocate its kvm_vcpu structs from the kvm_vcpu_cache rather than having them embedded in the kvmppc_vcpu_book3s struct, which is allocated with vzalloc. The reason is to reduce the differences between PR and HV KVM in order to make is easier to have them coexist in one kernel binary. With this, the kvm_vcpu struct has a pointer to the kvmppc_vcpu_book3s struct. The pointer to the kvmppc_book3s_shadow_vcpu struct has moved from the kvmppc_vcpu_book3s struct to the kvm_vcpu struct, and is only present for 32-bit, since it is only used for 32-bit. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: squash in compile fix from Aneesh] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:05 +02:00
Paul Mackerras	9308ab8e2d	KVM: PPC: Book3S PR: Make HPT accesses and updates SMP-safe This adds a per-VM mutex to provide mutual exclusion between vcpus for accesses to and updates of the guest hashed page table (HPT). This also makes the code use single-byte writes to the HPT entry when updating of the reference (R) and change (C) bits. The reason for doing this, rather than writing back the whole HPTE, is that on non-PAPR virtual machines, the guest OS might be writing to the HPTE concurrently, and writing back the whole HPTE might conflict with that. Also, real hardware does single-byte writes to update R and C. The new mutex is taken in kvmppc_mmu_book3s_64_xlate() when reading the HPT and updating R and/or C, and in the PAPR HPT update hcalls (H_ENTER, H_REMOVE, etc.). Having the mutex means that we don't need to use a hypervisor lock bit in the HPT update hcalls, and we don't need to be careful about the order in which the bytes of the HPTE are updated by those hcalls. The other change here is to make emulated TLB invalidations (tlbie) effective across all vcpus. To do this we call kvmppc_mmu_pte_vflush for all vcpus in kvmppc_ppc_book3s_64_tlbie(). For 32-bit, this makes the setting of the accessed and dirty bits use single-byte writes, and makes tlbie invalidate shadow HPTEs for all vcpus. With this, PR KVM can successfully run SMP guests. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:04 +02:00
Paul Mackerras	c9029c341d	KVM: PPC: Book3S PR: Use 64k host pages where possible Currently, PR KVM uses 4k pages for the host-side mappings of guest memory, regardless of the host page size. When the host page size is 64kB, we might as well use 64k host page mappings for guest mappings of 64kB and larger pages and for guest real-mode mappings. However, the magic page has to remain a 4k page. To implement this, we first add another flag bit to the guest VSID values we use, to indicate that this segment is one where host pages should be mapped using 64k pages. For segments with this bit set we set the bits in the shadow SLB entry to indicate a 64k base page size. When faulting in host HPTEs for this segment, we make them 64k HPTEs instead of 4k. We record the pagesize in struct hpte_cache for use when invalidating the HPTE. For now we restrict the segment containing the magic page (if any) to 4k pages. It should be possible to lift this restriction in future by ensuring that the magic 4k page is appropriately positioned within a host 64k page. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:03 +02:00
Paul Mackerras	a4a0f2524a	KVM: PPC: Book3S PR: Allow guest to use 64k pages This adds the code to interpret 64k HPTEs in the guest hashed page table (HPT), 64k SLB entries, and to tell the guest about 64k pages in kvm_vm_ioctl_get_smmu_info(). Guest 64k pages are still shadowed by 4k pages. This also adds another hash table to the four we have already in book3s_mmu_hpte.c to allow us to find all the PTEs that we have instantiated that match a given 64k guest page. The tlbie instruction changed starting with POWER6 to use a bit in the RB operand to indicate large page invalidations, and to use other RB bits to indicate the base and actual page sizes and the segment size. 64k pages came in slightly earlier, with POWER5++. We use one bit in vcpu->arch.hflags to indicate that the emulated cpu supports 64k pages, and another to indicate that it has the new tlbie definition. The KVM_PPC_GET_SMMU_INFO ioctl presents a bit of a problem, because the MMU capabilities depend on which CPU model we're emulating, but it is a VM ioctl not a VCPU ioctl and therefore doesn't get passed a VCPU fd. In addition, commonly-used userspace (QEMU) calls it before setting the PVR for any VCPU. Therefore, as a best effort we look at the first vcpu in the VM and return 64k pages or not depending on its capabilities. We also make the PVR default to the host PVR on recent CPUs that support 1TB segments (and therefore multiple page sizes as well) so that KVM_PPC_GET_SMMU_INFO will include 64k page and 1TB segment support on those CPUs. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:03 +02:00
Paul Mackerras	a2d56020d1	KVM: PPC: Book3S PR: Keep volatile reg values in vcpu rather than shadow_vcpu Currently PR-style KVM keeps the volatile guest register values (R0 - R13, CR, LR, CTR, XER, PC) in a shadow_vcpu struct rather than the main kvm_vcpu struct. For 64-bit, the shadow_vcpu exists in two places, a kmalloc'd struct and in the PACA, and it gets copied back and forth in kvmppc_core_vcpu_load/put(), because the real-mode code can't rely on being able to access the kmalloc'd struct. This changes the code to copy the volatile values into the shadow_vcpu as one of the last things done before entering the guest. Similarly the values are copied back out of the shadow_vcpu to the kvm_vcpu immediately after exiting the guest. We arrange for interrupts to be still disabled at this point so that we can't get preempted on 64-bit and end up copying values from the wrong PACA. This means that the accessor functions in kvm_book3s.h for these registers are greatly simplified, and are same between PR and HV KVM. In places where accesses to shadow_vcpu fields are now replaced by accesses to the kvm_vcpu, we can also remove the svcpu_get/put pairs. Finally, on 64-bit, we don't need the kmalloc'd struct at all any more. With this, the time to read the PVR one million times in a loop went from 567.7ms to 575.5ms (averages of 6 values), an increase of about 1.4% for this worse-case test for guest entries and exits. The standard deviation of the measurements is about 11ms, so the difference is only marginally significant statistically. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:03 +02:00
Paul Mackerras	388cc6e133	KVM: PPC: Book3S HV: Support POWER6 compatibility mode on POWER7 This enables us to use the Processor Compatibility Register (PCR) on POWER7 to put the processor into architecture 2.05 compatibility mode when running a guest. In this mode the new instructions and registers that were introduced on POWER7 are disabled in user mode. This includes all the VSX facilities plus several other instructions such as ldbrx, stdbrx, popcntw, popcntd, etc. To select this mode, we have a new register accessible through the set/get_one_reg interface, called KVM_REG_PPC_ARCH_COMPAT. Setting this to zero gives the full set of capabilities of the processor. Setting it to one of the "logical" PVR values defined in PAPR puts the vcpu into the compatibility mode for the corresponding architecture level. The supported values are: 0x0f000002 Architecture 2.05 (POWER6) 0x0f000003 Architecture 2.06 (POWER7) 0x0f100003 Architecture 2.06+ (POWER7+) Since the PCR is per-core, the architecture compatibility level and the corresponding PCR value are stored in the struct kvmppc_vcore, and are therefore shared between all vcpus in a virtual core. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: squash in fix to add missing break statements and documentation] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:02 +02:00
Paul Mackerras	4b8473c9c1	KVM: PPC: Book3S HV: Add support for guest Program Priority Register POWER7 and later IBM server processors have a register called the Program Priority Register (PPR), which controls the priority of each hardware CPU SMT thread, and affects how fast it runs compared to other SMT threads. This priority can be controlled by writing to the PPR or by use of a set of instructions of the form or rN,rN,rN which are otherwise no-ops but have been defined to set the priority to particular levels. This adds code to context switch the PPR when entering and exiting guests and to make the PPR value accessible through the SET/GET_ONE_REG interface. When entering the guest, we set the PPR as late as possible, because if we are setting a low thread priority it will make the code run slowly from that point on. Similarly, the first-level interrupt handlers save the PPR value in the PACA very early on, and set the thread priority to the medium level, so that the interrupt handling code runs at a reasonable speed. Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:02 +02:00
Paul Mackerras	a0144e2a6b	KVM: PPC: Book3S HV: Store LPCR value for each virtual core This adds the ability to have a separate LPCR (Logical Partitioning Control Register) value relating to a guest for each virtual core, rather than only having a single value for the whole VM. This corresponds to what real POWER hardware does, where there is a LPCR per CPU thread but most of the fields are required to have the same value on all active threads in a core. The per-virtual-core LPCR can be read and written using the GET/SET_ONE_REG interface. Userspace can can only modify the following fields of the LPCR value: DPFD Default prefetch depth ILE Interrupt little-endian TC Translation control (secondary HPT hash group search disable) We still maintain a per-VM default LPCR value in kvm->arch.lpcr, which contains bits relating to memory management, i.e. the Virtualized Partition Memory (VPM) bits and the bits relating to guest real mode. When this default value is updated, the update needs to be propagated to the per-vcore values, so we add a kvmppc_update_lpcr() helper to do that. Signed-off-by: Paul Mackerras <paulus@samba.org> [agraf: fix whitespace] Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:45:01 +02:00
Paul Mackerras	c0867fd509	KVM: PPC: Book3S: Add GET/SET_ONE_REG interface for VRSAVE The VRSAVE register value for a vcpu is accessible through the GET/SET_SREGS interface for Book E processors, but not for Book 3S processors. In order to make this accessible for Book 3S processors, this adds a new register identifier for GET/SET_ONE_REG, and adds the code to implement it. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:44:59 +02:00
Paul Mackerras	93b0f4dc29	KVM: PPC: Book3S HV: Implement timebase offset for guests This allows guests to have a different timebase origin from the host. This is needed for migration, where a guest can migrate from one host to another and the two hosts might have a different timebase origin. However, the timebase seen by the guest must not go backwards, and should go forwards only by a small amount corresponding to the time taken for the migration. Therefore this provides a new per-vcpu value accessed via the one_reg interface using the new KVM_REG_PPC_TB_OFFSET identifier. This value defaults to 0 and is not modified by KVM. On entering the guest, this value is added onto the timebase, and on exiting the guest, it is subtracted from the timebase. This is only supported for recent POWER hardware which has the TBU40 (timebase upper 40 bits) register. Writing to the TBU40 register only alters the upper 40 bits of the timebase, leaving the lower 24 bits unchanged. This provides a way to modify the timebase for guest migration without disturbing the synchronization of the timebase registers across CPU cores. The kernel rounds up the value given to a multiple of 2^24. Timebase values stored in KVM structures (struct kvm_vcpu, struct kvmppc_vcore, etc.) are stored as host timebase values. The timebase values in the dispatch trace log need to be guest timebase values, however, since that is read directly by the guest. This moves the setting of vcpu->arch.dec_expires on guest exit to a point after we have restored the host timebase so that vcpu->arch.dec_expires is a host timebase value. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:44:59 +02:00
Paul Mackerras	14941789f2	KVM: PPC: Book3S HV: Save/restore SIAR and SDAR along with other PMU registers Currently we are not saving and restoring the SIAR and SDAR registers in the PMU (performance monitor unit) on guest entry and exit. The result is that performance monitoring tools in the guest could get false information about where a program was executing and what data it was accessing at the time of a performance monitor interrupt. This fixes it by saving and restoring these registers along with the other PMU registers on guest entry/exit. This also provides a way for userspace to access these values for a vcpu via the one_reg interface. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:44:59 +02:00
Michael Neuling	3b7834743f	KVM: PPC: Book3S HV: Reserve POWER8 space in get/set_one_reg This reserves space in get/set_one_reg ioctl for the extra guest state needed for POWER8. It doesn't implement these at all, it just reserves them so that the ABI is defined now. A few things to note here: - This add a lot state for transactional memory. TM suspend mode, this is unavoidable, you can't simply roll back all transactions and store only the checkpointed state. I've added this all to get/set_one_reg (including GPRs) rather than creating a new ioctl which returns a struct kvm_regs like KVM_GET_REGS does. This means we if we need to extract the TM state, we are going to need a bucket load of IOCTLs. Hopefully most of the time this will not be needed as we can look at the MSR to see if TM is active and only grab them when needed. If this becomes a bottle neck in future we can add another ioctl to grab all this state in one go. - The TM state is offset by 0x80000000. - For TM, I've done away with VMX and FP and created a single 64x128 bit VSX register space. - I've left a space of 1 (at 0x9c) since Paulus needs to add a value which applies to POWER7 as well. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Alexander Graf <agraf@suse.de>	2013-10-17 14:44:59 +02:00
James Yang	9863c28a2a	powerpc: Emulate sync instruction variants Reserved fields of the sync instruction have been used for other instructions (e.g. lwsync). On processors that do not support variants of the sync instruction, emulate it by executing a sync to subsume the effect of the intended instruction. Signed-off-by: James Yang <James.Yang@freescale.com> [scottwood@freescale.com: whitespace and subject line fix] Signed-off-by: Scott Wood <scottwood@freescale.com>	2013-10-16 18:51:18 -05:00
Christoffer Dall	2c5350e934	KVM: PPC: Get rid of KVM_HPAGE defines Now when the main kvm code relying on these defines has been moved to the x86 specific part of the world, we can get rid of these. Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org> Signed-off-by: Gleb Natapov <gleb@redhat.com>	2013-10-14 10:12:11 +03:00
Benjamin Herrenschmidt	3ad26e5c44	Merge branch 'for-kvm' into next Topic branch for commits that the KVM tree might want to pull in separately. Hand merged a few files due to conflicts with the LE stuff Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 18:23:53 +11:00
Paul Mackerras	18461960cb	powerpc: Provide for giveup_fpu/altivec to save state in alternate location This provides a facility which is intended for use by KVM, where the contents of the FP/VSX and VMX (Altivec) registers can be saved away to somewhere other than the thread_struct when kernel code wants to use floating point or VMX instructions. This is done by providing a pointer in the thread_struct to indicate where the state should be saved to. The giveup_fpu() and giveup_altivec() functions test these pointers and save state to the indicated location if they are non-NULL. Note that the MSR_FP/VEC bits in task->thread.regs->msr are still used to indicate whether the CPU register state is live, even when an alternate save location is being used. This also provides load_fp_state() and load_vr_state() functions, which load up FP/VSX and VMX state from memory into the CPU registers, and corresponding store_fp_state() and store_vr_state() functions, which store FP/VSX and VMX state into memory from the CPU registers. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:26:50 +11:00
Paul Mackerras	de79f7b9f6	powerpc: Put FP/VSX and VR state into structures This creates new 'thread_fp_state' and 'thread_vr_state' structures to store FP/VSX state (including FPSCR) and Altivec/VSX state (including VSCR), and uses them in the thread_struct. In the thread_fp_state, the FPRs and VSRs are represented as u64 rather than double, since we rarely perform floating-point computations on the values, and this will enable the structures to be used in KVM code as well. Similarly FPSCR is now a u64 rather than a structure of two 32-bit values. This takes the offsets out of the macros such as SAVE_32FPRS, REST_32FPRS, etc. This enables the same macros to be used for normal and transactional state, enabling us to delete the transactional versions of the macros. This also removes the unused do_load_up_fpu and do_load_up_altivec, which were in fact buggy since they didn't create large enough stack frames to account for the fact that load_up_fpu and load_up_altivec are not designed to be called from C and assume that their caller's stack frame is an interrupt frame. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:26:49 +11:00
Alexey Kardashevskiy	8e0a1611cb	powerpc: add real mode support for dma operations on powernv The existing TCE machine calls (tce_build and tce_free) only support virtual mode as they call __raw_writeq for TCE invalidation what fails in real mode. This introduces tce_build_rm and tce_free_rm real mode versions which do mostly the same but use "Store Doubleword Caching Inhibited Indexed" instruction for TCE invalidation. This new feature is going to be utilized by real mode support of VFIO. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:24:40 +11:00
Alexey Kardashevskiy	8e0861fa3c	powerpc: Prepare to support kernel handling of IOMMU map/unmap The current VFIO-on-POWER implementation supports only user mode driven mapping, i.e. QEMU is sending requests to map/unmap pages. However this approach is really slow, so we want to move that to KVM. Since H_PUT_TCE can be extremely performance sensitive (especially with network adapters where each packet needs to be mapped/unmapped) we chose to implement that as a "fast" hypercall directly in "real mode" (processor still in the guest context but MMU off). To be able to do that, we need to provide some facilities to access the struct page count within that real mode environment as things like the sparsemem vmemmap mappings aren't accessible. This adds an API function realmode_pfn_to_page() to get page struct when MMU is off. This adds to MM a new function put_page_unless_one() which drops a page if counter is bigger than 1. It is going to be used when MMU is off (for example, real mode on PPC64) and we want to make sure that page release will not happen in real mode as it may crash the kernel in a horrible way. CONFIG_SPARSEMEM_VMEMMAP and CONFIG_FLATMEM are supported. Cc: linux-mm@kvack.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 17:24:39 +11:00
Gavin Shan	8c6852e036	powerpc/eeh: Output PHB3 diag-data The patch adds function ioda_eeh_phb3_phb_diag() to dump PHB3 PHB diag-data. That's called while detecting informative errors or frozen PE on the specific PHB. Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:54:12 +11:00
Benjamin Herrenschmidt	aaa63093dd	powerpc/scom: Change scom_read() and scom_write() to return errors scom_read() now returns the read value via a pointer argument and both functions return an int error code Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:53:34 +11:00
Benjamin Herrenschmidt	ac237b65f5	powerpc: Enable /dev/port when isa_io_special is set isa_io_special is set when the platform provides a "special" implementation of inX/outX via some FW interface for example. Such a platform doesn't need an ISA bridge on PCI, and so /dev/port should be made available even if one isn't present. This makes the LPC bus IOs accessible via /dev/port on PowerNV Power8 Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:53:30 +11:00
Michael Ellerman	a4da0d50b2	powerpc: Implement arch_get_random_long/int() for powernv Add the plumbing to implement arch_get_random_long/int(). It didn't seem worth adding an extra ppc_md hook for int, so we reuse the one for long. Add an implementation for powernv based on the hwrng found in power7+ systems. We whiten the output of the hwrng, and the result passes all the dieharder tests. Signed-off-by: Michael Ellerman <michael@ellerman.id.au> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:50:19 +11:00
Benjamin Herrenschmidt	99fc1d91b8	powerpc/hvsi: Fix endian issues in HVSI driver Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:53 +11:00
Anton Blanchard	5e4da530a5	powerpc/powernv: Fix some PCI sparse errors and one LE bug pnv_pci_setup_bml_iommu was missing a byteswap of a device tree property. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:52 +11:00
Anton Blanchard	6feff6d4a5	powerpc/powernv: More little endian issues in OPAL RTC driver Sparse caught an issue where opal_set_rtc_time was incorrectly byteswapping. Also fix a number of sparse warnings. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:51 +11:00
Benjamin Herrenschmidt	4f89363b11	powerpc/powernv: Fix endian issues in OPAL console and udbg backend Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:48 +11:00
Anton Blanchard	de577a3568	powerpc: Use generic memcpy code in little endian We need to fix some endian issues in our memcpy code. For now just enable the generic memcpy routine for little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:40 +11:00
Anton Blanchard	7a332b0c9a	powerpc: Use generic checksum code in little endian We need to fix some endian issues in our checksum code. For now just enable the generic checksum routines for little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:39 +11:00
Benjamin Herrenschmidt	5c0484e25e	powerpc: Endian safe trampoline Create a trampoline that works in either endian and flips to the expected endian. Use it for primary and secondary thread entry as well as RTAS and OF call return. Credit for finding the magic instruction goes to Paul Mackerras Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:34 +11:00
Ian Munsie	c57a5ce0df	powerpc: Include the appropriate endianness header This patch will have powerpc include the appropriate generic endianness header depending on what the compiler reports. Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:33 +11:00
Anton Blanchard	ef1967ff87	powerpc: Set MSR_LE bit on little endian builds We need to set MSR_LE in kernel and userspace for little endian builds Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:31 +11:00
Anton Blanchard	4c74c330c2	powerpc: Add little endian support for word-at-a-time functions The powerpc word-at-a-time functions are big endian specific. Bring in the x86 version in order to support little endian builds. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:30 +11:00
Ian Munsie	15cba23e69	powerpc: Support endian agnostic MMIO This patch maps the MMIO functions for 32bit PowerPC to their appropriate instructions depending on CPU endianness. The macros used to create the corresponding inline functions are also renamed by this patch. Previously they had BE or LE in their names which was misleading - they had nothing to do with endianness, but actually created different instruction forms so their new names reflect the instruction form they are creating (D-Form and X-Form). Little endian 64bit PowerPC is not supported, so the lack of mappings (and corresponding breakage) for that case is intentional to bring the attention of anyone doing a 64bit little endian port. 64bit big endian is unaffected. [ Added 64 bit versions - Anton ] Signed-off-by: Ian Munsie <imunsie@au1.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:29 +11:00
Anton Blanchard	926f160f46	powerpc: Little endian builds double word swap VSX state during context save/restore The elements within VSX loads and stores are big endian ordered regardless of endianness. Our VSX context save/restore code uses lxvd2x and stxvd2x which is a 2x doubleword operation. This means the two doublewords will be swapped and we have to perform another swap to undo it. We need to do this on save and restore. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:28 +11:00
Anton Blanchard	e156bd8ad7	powerpc: Fix offset of FPRs in VSX registers in little endian builds The FPRs overlap the high doublewords of the first 32 VSX registers. Fix TS_FPROFFSET and TS_VSRLOWOFFSET so we access the correct fields in little endian mode. If VSX is disabled the FPRs are only one doubleword in length so TS_FPROFFSET needs adjusting in little endian. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:26 +11:00
Anton Blanchard	12f04f2be8	powerpc: Book 3S MMU little endian support Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2013-10-11 16:48:26 +11:00
Ingo Molnar	ec0ad3d01f	Merge branch 'core/urgent' into sched/core Merge in asm goto fix, to be able to apply the asm/rmwcc.h fix. Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-11 07:39:37 +02:00
Ingo Molnar	3f0116c323	compiler/gcc4: Add quirk for 'asm goto' miscompilation bug Fengguang Wu, Oleg Nesterov and Peter Zijlstra tracked down a kernel crash to a GCC bug: GCC miscompiles certain 'asm goto' constructs, as outlined here: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58670 Implement a workaround suggested by Jakub Jelinek. Reported-and-tested-by: Fengguang Wu <fengguang.wu@intel.com> Reported-by: Oleg Nesterov <oleg@redhat.com> Reported-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Suggested-by: Jakub Jelinek <jakub@redhat.com> Reviewed-by: Richard Henderson <rth@twiddle.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: <stable@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org>	2013-10-11 07:39:14 +02:00
Rob Herring	5c19c5c6d4	powerpc: clean-up include ordering in prom.h Now that the core OF headers don't depend on prom.h, rearrange the includes. There are still lots of implicit includes in the powerpc tree, so the includes of OF headers are still necessary. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Grant Likely <grant.likely@linaro.org>	2013-10-09 20:04:11 -05:00
Rob Herring	d0dfa16a60	of: move of_translate_dma_address to of_address.h of_translate_dma_address is implemented in common code, so move the declaration there too. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Grant Likely <grant.likely@linaro.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: linuxppc-dev@lists.ozlabs.org	2013-10-09 20:04:10 -05:00
Rob Herring	32df8dca50	of: remove HAVE_ARCH_DEVTREE_FIXUPS HAVE_ARCH_DEVTREE_FIXUPS appears to always be needed except for sparc, but it is only used for /proc/device-teee and sparc does not enable /proc/device-tree. So this option is redundant. Remove the option and always enable it. This has the side effect of fixing /proc/device-tree on arches such as arm64 which failed to define this option. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Vineet Gupta <vgupta@synopsys.com> Acked-by: Grant Likely <grant.likely@linaro.org> Cc: Russell King <linux@arm.linux.org.uk> Cc: James Hogan <james.hogan@imgtec.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Jonas Bonn <jonas@southpole.se> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: Chris Zankel <chris@zankel.net> Cc: Max Filippov <jcmvbkbc@gmail.com>	2013-10-09 20:04:08 -05:00
Rob Herring	0c3f061c19	of: implement of_node_to_nid as a weak function Implement of_node_to_nid as weak function to remove the dependency on asm/prom.h. This is in preparation to make prom.h optional. Signed-off-by: Rob Herring <rob.herring@calxeda.com> Acked-by: Grant Likely <grant.likely@linaro.org>	2013-10-09 20:04:07 -05:00

1 2 3 4 5 ...

2202 Commits