4080 Commits

Author SHA1 Message Date
Gerhard Sittig
7d71d5b2e8 clk: mpc5xxx: switch to COMMON_CLK, retire PPC_CLOCK
the setup before the change was
- arch/powerpc/Kconfig had the PPC_CLOCK option, off by default
- depending on the PPC_CLOCK option the arch/powerpc/kernel/clock.c file
  was built, which implements the clk.h API but always returns -ENOSYS
  unless a platform registers specific callbacks
- the MPC52xx platform selected PPC_CLOCK but did not register any
  callbacks, thus all clk.h API calls keep resulting in -ENOSYS errors
  (which is OK, all peripheral drivers deal with the situation)
- the MPC512x platform selected PPC_CLOCK and registered specific
  callbacks implemented in arch/powerpc/platforms/512x/clock.c, thus
  provided real support for the clock API
- no other powerpc platform did select PPC_CLOCK

the situation after the change is
- the MPC512x platform implements the COMMON_CLK interface, and thus the
  PPC_CLOCK approach in arch/powerpc/platforms/512x/clock.c has become
  obsolete
- the MPC52xx platform still lacks genuine support for the clk.h API
  while this is not a change against the previous situation (the error
  code returned from COMMON_CLK stubs differs but every call still
  results in an error)
- with all references gone, the arch/powerpc/kernel/clock.c wrapper and
  the PPC_CLOCK option have become obsolete, as did the clk_interface.h
  header file

the switch from PPC_CLOCK to COMMON_CLK is done for all platforms within
the same commit such that multiplatform kernels (the combination of 512x
and 52xx within one executable) keep working

Cc: Mike Turquette <mturquette@linaro.org>
Cc: Anatolij Gustschin <agust@denx.de>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Gerhard Sittig <gsi@denx.de>
Signed-off-by: Anatolij Gustschin <agust@denx.de>
2014-01-12 18:53:04 +01:00
Diana Craciun
ed2ddc56e7 powerpc: Replaced tlbilx with tlbwe in the initialization code
On Freescale e6500 cores EPCR[DGTMI] controls whether guest supervisor
state can execute TLB management instructions. If EPCR[DGTMI]=0
tlbwe and tlbilx are allowed to execute normally in the guest state.

A hypervisor may choose to virtualize TLB1 and for this purpose it
may use IPROT to protect the entries for being invalidated by the
guest. However, because tlbwe and tlbilx execution in the guest state
are sharing the same bit, it is not possible to have a scenario where
tlbwe is allowed to be executed in guest state and tlbilx traps. When
guest TLB management instructions are allowed to be executed in guest
state the guest cannot use tlbilx to invalidate TLB1 guest entries.

Linux is using tlbilx in the boot code to invalidate the temporary
entries it creates when initializing the MMU. The patch is replacing
the usage of tlbilx in initialization code with tlbwe with VALID bit
cleared.

Linux is also using tlbilx in other contexts (like huge pages or
indirect entries) but removing the tlbilx from the initialization code
offers the possibility to have scenarios under hypervisor which are
not using huge pages or indirect entries.

Signed-off-by: Diana Craciun <Diana.Craciun@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-10 17:34:04 -06:00
Scott Wood
28efc35fe6 powerpc/e6500: TLB miss handler with hardware tablewalk support
There are a few things that make the existing hw tablewalk handlers
unsuitable for e6500:

 - Indirect entries go in TLB1 (though the resulting direct entries go in
   TLB0).

 - It has threads, but no "tlbsrx." -- so we need a spinlock and
   a normal "tlbsx".  Because we need this lock, hardware tablewalk
   is mandatory on e6500 unless we want to add spinlock+tlbsx to
   the normal bolted TLB miss handler.

 - TLB1 has no HES (nor next-victim hint) so we need software round robin
   (TODO: integrate this round robin data with hugetlb/KVM)

 - The existing tablewalk handlers map half of a page table at a time,
   because IBM hardware has a fixed 1MiB indirect page size.  e6500
   has variable size indirect entries, with a minimum of 2MiB.
   So we can't do the half-page indirect mapping, and even if we
   could it would be less efficient than mapping the full page.

 - Like on e5500, the linear mapping is bolted, so we don't need the
   overhead of supporting nested tlb misses.

Note that hardware tablewalk does not work in rev1 of e6500.
We do not expect to support e6500 rev1 in mainline Linux.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Mihai Caraman <mihai.caraman@freescale.com>
2014-01-09 17:52:19 -06:00
Kevin Hao
0be7d969b0 powerpc/fsl_booke: smp support for booting a relocatable kernel above 64M
When booting above the 64M for a secondary cpu, we also face the
same issue as the boot cpu that the PAGE_OFFSET map two different
physical address for the init tlb and the final map. So we have to use
switch_to_as1/restore_to_as0 between the conversion of these two
maps. When restoring to as0 for a secondary cpu, we only need to
return to the caller. So add a new parameter for function
restore_to_as0 for this purpose.

Use LOAD_REG_ADDR_PIC to get the address of variables which may
be used before we set the final map in cams for the secondary cpu.
Move the setting of cams a bit earlier in order to avoid the
unnecessary using of LOAD_REG_ADDR_PIC.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:18 -06:00
Kevin Hao
7d2471f9fa powerpc/fsl_booke: make sure PAGE_OFFSET map to memstart_addr for relocatable kernel
This is always true for a non-relocatable kernel. Otherwise the kernel
would get stuck. But for a relocatable kernel, it seems a little
complicated. When booting a relocatable kernel, we just align the
kernel start addr to 64M and map the PAGE_OFFSET from there. The
relocation will base on this virtual address. But if this address
is not the same as the memstart_addr, we will have to change the
map of PAGE_OFFSET to the real memstart_addr and do another relocation
again.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
[scottwood@freescale.com: make offset long and non-negative in simple case]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:17 -06:00
Kevin Hao
b27652dd21 powerpc: introduce early_get_first_memblock_info
For a relocatable kernel since it can be loaded at any place, there
is no any relation between the kernel start addr and the memstart_addr.
So we can't calculate the memstart_addr from kernel start addr. And
also we can't wait to do the relocation after we get the real
memstart_addr from device tree because it is so late. So introduce
a new function we can use to get the first memblock address and size
in a very early stage (before machine_init).

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:17 -06:00
Kevin Hao
78a235efdc powerpc/fsl_booke: set the tlb entry for the kernel address in AS1
We use the tlb1 entries to map low mem to the kernel space. In the
current code, it assumes that the first tlb entry would cover the
kernel image. But this is not true for some special cases, such as
when we run a relocatable kernel above the 64M or set
CONFIG_KERNEL_START above 64M. So we choose to switch to address
space 1 before setting these tlb entries.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:16 -06:00
Kevin Hao
dd189692d4 powerpc: enable the relocatable support for the fsl booke 32bit kernel
This is based on the codes in the head_44x.S. The difference is that
the init tlb size we used is 64M. With this patch we can only load the
kernel at address between memstart_addr ~ memstart_addr + 64M. We will
fix this restriction in the following patches.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:16 -06:00
Kevin Hao
99739611e8 powerpc/fsl_booke: introduce get_phys_addr function
Move the codes which translate a effective address to physical address
to a separate function. So it can be reused by other code.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:15 -06:00
Kevin Hao
7c732cba3d powerpc/fsl_booke: protect the access to MAS7
The e500v1 doesn't implement the MAS7, so we should avoid to access
this register on that implementations. In the current kernel, the
access to MAS7 are protected by either CONFIG_PHYS_64BIT or
MMU_FTR_BIG_PHYS. Since some code are executed before the code
patching, we have to use CONFIG_PHYS_64BIT in these cases.

Signed-off-by: Kevin Hao <haokexin@gmail.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:52:14 -06:00
Wang Dongsheng
a7189483f0 powerpc/85xx: add sysfs for pw20 state and altivec idle
Add a sys interface to enable/diable pw20 state or altivec idle, and
control the wait entry time.

Enable/Disable interface:
    0, disable. 1, enable.
    /sys/devices/system/cpu/cpuX/pw20_state
    /sys/devices/system/cpu/cpuX/altivec_idle

Set wait time interface:(Nanosecond)
    /sys/devices/system/cpu/cpuX/pw20_wait_time
    /sys/devices/system/cpu/cpuX/altivec_idle_wait_time
Example: Base on TBfreq is 41MHZ.
    1~48(ns): TB[63]
    49~97(ns): TB[62]
    98~195(ns): TB[61]
    196~390(ns): TB[60]
    391~780(ns): TB[59]
    781~1560(ns): TB[58]
    ...

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
[scottwood@freescale.com: change ifdef]
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-09 17:51:38 -06:00
Paul Mackerras
595e4f7e69 KVM: PPC: Book3S HV: Use load/store_fp_state functions in HV guest entry/exit
This modifies kvmppc_load_fp and kvmppc_save_fp to use the generic
FP/VSX and VMX load/store functions instead of open-coding the
FP/VSX/VMX load/store instructions.  Since kvmppc_load/save_fp don't
follow C calling conventions, we make them private symbols within
book3s_hv_rmhandlers.S.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:03 +01:00
Paul Mackerras
efff191223 KVM: PPC: Store FP/VSX/VMX state in thread_fp/vr_state structures
This uses struct thread_fp_state and struct thread_vr_state to store
the floating-point, VMX/Altivec and VSX state, rather than flat arrays.
This makes transferring the state to/from the thread_struct simpler
and allows us to unify the get/set_one_reg implementations for the
VSX registers.

Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:15:00 +01:00
Bharat Bhushan
1820a8d216 kvm/powerpc: rename kvm_hypercall() to epapr_hypercall()
kvm_hypercall() have nothing KVM specific, so renamed to epapr_hypercall().
Also this in moved to arch/powerpc/include/asm/epapr_hcalls.h

Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2014-01-09 10:14:56 +01:00
Wang Dongsheng
1d47ddf7c3 powerpc/85xx: add hardware automatically enter pw20 state
Using hardware features make core automatically enter PW20 state.
Set a TB count to hardware, the effective count begins when PW10
is entered. When the effective period has expired, the core will
proceed from PW10 to PW20 if no exit conditions have occurred during
the period.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-07 19:40:28 -06:00
Wang Dongsheng
202e059ce3 powerpc/85xx: add hardware automatically enter altivec idle state
Each core's AltiVec unit may be placed into a power savings mode
by turning off power to the unit. Core hardware will automatically
power down the AltiVec unit after no AltiVec instructions have
executed in N cycles. The AltiVec power-control is triggered by hardware.

Signed-off-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-07 19:39:48 -06:00
Scott Wood
b58a7bd6df powerpc/fsl-booke: Use SPRN_SPRGn rather than mfsprg/mtsprg
This fixes a build break that was probably introduced with the removal
of -Wa,-me500 (commit f49596a4cf4753d13951608f24f939a59fdcc653), where
the assembler refuses to recognize SPRG4-7 with a generic PPC target.

Signed-off-by: Scott Wood <scottwood@freescale.com>
Cc: Dongsheng Wang <dongsheng.wang@freescale.com>
Cc: Anton Vorontsov <avorontsov@mvista.com>
Reviewed-by: Wang Dongsheng <dongsheng.wang@freescale.com>
Tested-by: Wang Dongsheng <dongsheng.wang@freescale.com>
2014-01-07 19:06:03 -06:00
Joseph Myers
640e922501 powerpc: fix exception clearing in e500 SPE float emulation
The e500 SPE floating-point emulation code clears existing exceptions
(__FPU_FPSCR &= ~FP_EX_MASK;) before ORing in the exceptions from the
emulated operation.  However, these exception bits are the "sticky",
cumulative exception bits, and should only be cleared by the user
program setting SPEFSCR, not implicitly by any floating-point
instruction (whether executed purely by the hardware or emulated).
The spurious clearing of these bits shows up as missing exceptions in
glibc testing.

Fixing this, however, is not as simple as just not clearing the bits,
because while the bits may be from previous floating-point operations
(in which case they should not be cleared), the processor can also set
the sticky bits itself before the interrupt for an exception occurs,
and this can happen in cases when IEEE 754 semantics are that the
sticky bit should not be set.  Specifically, the "invalid" sticky bit
is set in various cases with non-finite operands, where IEEE 754
semantics do not involve raising such an exception, and the
"underflow" sticky bit is set in cases of exact underflow, whereas
IEEE 754 semantics are that this flag is set only for inexact
underflow.  Thus, for correct emulation the kernel needs to know the
setting of these two sticky bits before the instruction being
emulated.

When a floating-point operation raises an exception, the kernel can
note the state of the sticky bits immediately afterwards.  Some
<fenv.h> functions that affect the state of these bits, such as
fesetenv and feholdexcept, need to use prctl with PR_GET_FPEXC and
PR_SET_FPEXC anyway, and so it is natural to record the state of those
bits during that call into the kernel and so avoid any need for a
separate call into the kernel to inform it of a change to those bits.
Thus, the interface I chose to use (in this patch and the glibc port)
is that one of those prctl calls must be made after any userspace
change to those sticky bits, other than through a floating-point
operation that traps into the kernel anyway.  feclearexcept and
fesetexceptflag duly make those calls, which would not be required
were it not for this issue.

The previous EGLIBC port, and the uClibc code copied from it, is
fundamentally broken as regards any use of prctl for floating-point
exceptions because it didn't use the PR_FP_EXC_SW_ENABLE bit in its
prctl calls (and did various worse things, such as passing a pointer
when prctl expected an integer).  If you avoid anything where prctl is
used, the clearing of sticky bits still means it will never give
anything approximating correct exception semantics with existing
kernels.  I don't believe the patch makes things any worse for
existing code that doesn't try to inform the kernel of changes to
sticky bits - such code may get incorrect exceptions in some cases,
but it would have done so anyway in other cases.

Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-07 18:32:21 -06:00
Mihai Caraman
228b1a4730 powerpc/booke64: Add LRAT error exception handler
LRAT (Logical to Real Address Translation) present in MMU v2 provides hardware
translation from a logical page number (LPN) to a real page number (RPN) when
tlbwe is executed by a guest or when a page table translation occurs from a
guest virtual address.

Add LRAT error exception handler to Booke3E 64-bit kernel and the basic KVM
handler to avoid build breakage. This is a prerequisite for KVM LRAT support
that will follow.

Signed-off-by: Mihai Caraman <mihai.caraman@freescale.com>
Signed-off-by: Scott Wood <scottwood@freescale.com>
2014-01-07 18:15:29 -06:00
Linus Torvalds
6e4c61968b Merge branch 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc
Pull powerpc fixes from Ben Herrenschmidt:
 "A bit more endian problems found during testing of 3.13 and a few
  other simple fixes and regressions fixes"

* 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
  powerpc: Fix alignment of secondary cpu spin vars
  powerpc: Align p_end
  powernv/eeh: Add buffer for P7IOC hub error data
  powernv/eeh: Fix possible buffer overrun in ioda_eeh_phb_diag()
  powerpc: Make 64-bit non-VMX __copy_tofrom_user bi-endian
  powerpc: Make unaligned accesses endian-safe for powerpc
  powerpc: Fix bad stack check in exception entry
  powerpc/512x: dts: disable MPC5125 usb module
  powerpc/512x: dts: remove misplaced IRQ spec from 'soc' node (5125)
2013-12-30 10:22:57 -08:00
Benjamin Herrenschmidt
dece8ada99 Merge branch 'merge' into next
Merge a pile of fixes that went into the "merge" branch (3.13-rc's) such
as Anton Little Endian fixes.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 15:19:31 +11:00
Anton Blanchard
a68c33f359 powerpc: Fix endian issues in power7/8 machine check handler
The SLB save area is shared with the hypervisor and is defined
as big endian, so we need to byte swap on little endian builds.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:51:09 +11:00
Alistair Popple
d084775738 powerpc/iommu: Update the generic code to use dynamic iommu page sizes
This patch updates the generic iommu backend code to use the
it_page_shift field to determine the iommu page size instead of
using hardcoded values.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:17:19 +11:00
Alistair Popple
3a553170d3 powerpc/iommu: Add it_page_shift field to determine iommu page size
This patch adds a it_page_shift field to struct iommu_table and
initiliases it to 4K for all platforms.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:17:13 +11:00
Alistair Popple
e589a4404f powerpc/iommu: Update constant names to reflect their hardcoded page size
The powerpc iommu uses a hardcoded page size of 4K. This patch changes
the name of the IOMMU_PAGE_* macros to reflect the hardcoded values. A
future patch will use the existing names to support dynamic page
sizes.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:17:06 +11:00
Mahesh Salgaonkar
4e243b79b0 powerpc: Fix "attempt to move .org backwards" error
With recent machine check patch series changes, The exception vectors
starting from 0x4300 are now overflowing with allyesconfig. Fix that by
moving machine_check_common and machine_check_handle_early code out of
that region to make enough room for exception vector area.

Fixes this build error reportes by Stephen:

arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:958: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:959: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:983: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:984: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1003: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1013: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1014: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1015: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1016: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1017: Error: attempt to move .org backwards
arch/powerpc/kernel/exceptions-64s.S:1018: Error: attempt to move .org backwards

[Moved the code further down as it introduced link errors due to too long
 relative branches to the masked interrupts handlers from the exception
 prologs. Also removed the useless feature section --BenH
]

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Tested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:16:30 +11:00
Olof Johansson
7d4151b509 powerpc: Fix alignment of secondary cpu spin vars
Commit 5c0484e25ec0 ('powerpc: Endian safe trampoline') resulted in
losing proper alignment of the spinlock variables used when booting
secondary CPUs, causing some quite odd issues with failing to boot on
PA Semi-based systems.

This showed itself on ppc64_defconfig, but not on pasemi_defconfig,
so it had gone unnoticed when I initially tested the LE patch set.

Fix is to add explicit alignment instead of relying on good luck. :)

[ It appears that there is a different issue with PA Semi systems
  however this fix is definitely correct so applying anyway -- BenH
]

Fixes: 5c0484e25ec0 ('powerpc: Endian safe trampoline')
Reported-by: Christian Zigotzky <chzigotzky@xenosoft.de>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=67811
Signed-off-by: Olof Johansson <olof@lixom.net>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:02:34 +11:00
Anton Blanchard
286e4f90a7 powerpc: Align p_end
p_end is an 8 byte value embedded in the text section. This means it
is only 4 byte aligned when it should be 8 byte aligned. Fix this
by adding an explicit alignment.

This fixes an issue where POWER7 little endian builds with
CONFIG_RELOCATABLE=y fail to boot.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-30 14:02:33 +11:00
Yinghai Lu
fc2798502f PCI: Convert pcibios_resource_to_bus() to take a pci_bus, not a pci_dev
These interfaces:

  pcibios_resource_to_bus(struct pci_dev *dev, *bus_region, *resource)
  pcibios_bus_to_resource(struct pci_dev *dev, *resource, *bus_region)

took a pci_dev, but they really depend only on the pci_bus.  And we want to
use them in resource allocation paths where we have the bus but not a
device, so this patch converts them to take the pci_bus instead of the
pci_dev:

  pcibios_resource_to_bus(struct pci_bus *bus, *bus_region, *resource)
  pcibios_bus_to_resource(struct pci_bus *bus, *resource, *bus_region)

In fact, with standard PCI-PCI bridges, they only depend on the host
bridge, because that's the only place address translation occurs, but
we aren't going that far yet.

[bhelgaas: changelog]
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
2013-12-21 10:06:10 -07:00
Linus Torvalds
46dd0835ca The PPC folks had a large amount of changes queued for 3.13, and now they
are fixing the bugs.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJStImwAAoJEBvWZb6bTYbyvR0P/2tH/IuHe7xDaXyWy3JVlmzF
 CmdnOLTPSlQjpLv7BRQ0K5TAU6DZWisRnXGUp1e8+Do4Ho9OuZzJugCr1Lt/4kTA
 kZT2xWP5U4AbLTjoxlVckybk4Ci0oP+iZGqV8d95NurEb1oR1halAZ+7BTqujwch
 jGSd3gk6mVN4np09Bj06P0nddttJubIki1VeZyQUFILqAIkzWv4qyL/awibYCFQA
 +jHEcND8b5D9bkMniMojXaR0BGIdMZOKWGvKUdxbth+FbZgPqzOLwXoCVM5EmuuH
 9aIee65y34+WXT4EHIou5Q4HyDxuKpciv1A7UhwLxEcfgUklvHOV/nZeQAKFIBIt
 uabgHO/Psj6i9qSCuAJX8xYgB+BmktE8d+/r1XmIgQ/gPYRumOl5BVJo6OOIaGrF
 M6cgccPD1dnMzFt4ccxoM1OhJivh30XfHAKKco7i8DhwcHh1cYcYlDqPEOy3wBA5
 i4n99N/5gCSIB87y1EjvDw1CMiJ5PzuialvscH/a4knL9JFuukKS6O+C2z5LULKN
 TixvTZMZWuHdNWezahcjSpbDeqWPBdB8RIEbGi2xBAHU2hsuxV2acjhdQ0vVgP48
 qo8lLiXv4W030y9H+iflg5R6b3tJ5dmNKZN1fYiwhs4ijgL3wOu8iWia57sQFdyD
 Nb+X/MeeD+tD5JYVyqvr
 =k+i/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "The PPC folks had a large amount of changes queued for 3.13, and now
  they are fixing the bugs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Don't drop low-order page address bits
  powerpc: book3s: kvm: Don't abuse host r2 in exit path
  powerpc/kvm/booke: Fix build break due to stack frame size warning
  KVM: PPC: Book3S: PR: Enable interrupts earlier
  KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
  KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
  KVM: PPC: Book3S: PR: Don't clobber our exit handler id
  powerpc: kvm: fix rare but potential deadlock scene
  KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
  KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
  KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
  KVM: PPC: Book3S HV: Fix physical address calculations
2013-12-20 12:26:54 -08:00
Paolo Bonzini
5e6d26cf48 Patch queue for 3.13 - 2013-12-18
This fixes some grave issues we've only found after 3.13-rc1:
 
   - Make the modularized HV/PR book3s kvm work well as modules
   - Fix some race conditions
   - Fix compilation with certain compilers (booke)
   - Fix THP for book3s_hv
   - Fix preemption for book3s_pr
 
 Alexander Graf (4):
       KVM: PPC: Book3S: PR: Don't clobber our exit handler id
       KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
       KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
       KVM: PPC: Book3S: PR: Enable interrupts earlier
 
 Aneesh Kumar K.V (1):
       powerpc: book3s: kvm: Don't abuse host r2 in exit path
 
 Paul Mackerras (5):
       KVM: PPC: Book3S HV: Fix physical address calculations
       KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
       KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
       KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
       KVM: PPC: Book3S HV: Don't drop low-order page address bits
 
 Scott Wood (1):
       powerpc/kvm/booke: Fix build break due to stack frame size warning
 
 pingfan liu (1):
       powerpc: kvm: fix rare but potential deadlock scene
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.19 (GNU/Linux)
 
 iQIcBAABAgAGBQJSscYdAAoJECszeR4D/txgcqAP/1hztHJ+QVwOovEmHSkd6s9G
 A9Ib48U3r/YX5Xugp3VeJQEoSvvRQQDvi1lcu20YO7HRFL3AZBnq2/EgXaMSfu0s
 kKWZiadlpYNkSfjcipuia1yu2auAVWyGTMjuwWhKSH7WJnTrQD17vTNaOhnfrEvY
 wfUTCux7JSUlDnAuNBPHjtWgPsNXZ9U5ODThLVKMuXUceFxse/pRER+RM8/sGwGD
 h5uQicwPAD4bp2epg7zG7NgFs9np1U/WZvwHn3LGlb/eHJW0lB/lqdCFMtBFaDiA
 3GS3AOIJCtWhEPzghUJMyId8Yc7E5Bi27ur+8fOKHddbM+NFR154hTzoOuVZgvmq
 HdNhcjTDfhimKl+aPaQyFpnePBLk2hZ5zEyxr5eMocyvZ+uRL7ghhUBjnNFNXk1k
 FAlzyEWXirdumN2sS9u9/PUhoETL13yhghxXzDq35/rjWxPuLtjvVlmroQfPI5cl
 0AW5d3G5lEnb/vNo/dUFG8EAxunX26sgaro6XxLA3Y/tZ4691S9mNaeyLv/w4VDS
 T9IcLUIhnpkR6HPkXci1mRrX13GC1uBB74jhBJvgJs91UmgLZN3W3VEcS5ulXxxb
 UoLsDSO1qo2Md2KrRltsRcMJAaAjbbcTzApudpN24d6zMCUxxfnjNW9Q8h2+eaoi
 ST9nIxzK3a9HHnnJ6AsJ
 =kveZ
 -----END PGP SIGNATURE-----

Merge tag 'signed-for-3.13' of git://github.com/agraf/linux-2.6 into kvm-master

Patch queue for 3.13 - 2013-12-18

This fixes some grave issues we've only found after 3.13-rc1:

  - Make the modularized HV/PR book3s kvm work well as modules
  - Fix some race conditions
  - Fix compilation with certain compilers (booke)
  - Fix THP for book3s_hv
  - Fix preemption for book3s_pr

Alexander Graf (4):
      KVM: PPC: Book3S: PR: Don't clobber our exit handler id
      KVM: PPC: Book3S: PR: Export kvmppc_copy_to|from_svcpu
      KVM: PPC: Book3S: PR: Make svcpu -> vcpu store preempt savvy
      KVM: PPC: Book3S: PR: Enable interrupts earlier

Aneesh Kumar K.V (1):
      powerpc: book3s: kvm: Don't abuse host r2 in exit path

Paul Mackerras (5):
      KVM: PPC: Book3S HV: Fix physical address calculations
      KVM: PPC: Book3S HV: Refine barriers in guest entry/exit
      KVM: PPC: Book3S HV: Make tbacct_lock irq-safe
      KVM: PPC: Book3S HV: Take SRCU read lock around kvm_read_guest() call
      KVM: PPC: Book3S HV: Don't drop low-order page address bits

Scott Wood (1):
      powerpc/kvm/booke: Fix build break due to stack frame size warning

pingfan liu (1):
      powerpc: kvm: fix rare but potential deadlock scene
2013-12-20 19:13:58 +01:00
Aneesh Kumar K.V
36e7bb3802 powerpc: book3s: kvm: Don't abuse host r2 in exit path
We don't use PACATOC for PR. Avoid updating HOST_R2 with PR
KVM mode when both HV and PR are enabled in the kernel. Without this we
get the below crash

(qemu)
Unable to handle kernel paging request for data at address 0xffffffffffff8310
Faulting instruction address: 0xc00000000001d5a4
cpu 0x2: Vector: 300 (Data Access) at [c0000001dc53aef0]
    pc: c00000000001d5a4: .vtime_delta.isra.1+0x34/0x1d0
    lr: c00000000001d760: .vtime_account_system+0x20/0x60
    sp: c0000001dc53b170
   msr: 8000000000009032
   dar: ffffffffffff8310
 dsisr: 40000000
  current = 0xc0000001d76c62d0
  paca    = 0xc00000000fef1100   softe: 0        irq_happened: 0x01
    pid   = 4472, comm = qemu-system-ppc
enter ? for help
[c0000001dc53b200] c00000000001d760 .vtime_account_system+0x20/0x60
[c0000001dc53b290] c00000000008d050 .kvmppc_handle_exit_pr+0x60/0xa50
[c0000001dc53b340] c00000000008f51c kvm_start_lightweight+0xb4/0xc4
[c0000001dc53b510] c00000000008cdf0 .kvmppc_vcpu_run_pr+0x150/0x2e0
[c0000001dc53b9e0] c00000000008341c .kvmppc_vcpu_run+0x2c/0x40
[c0000001dc53ba50] c000000000080af4 .kvm_arch_vcpu_ioctl_run+0x54/0x1b0
[c0000001dc53bae0] c00000000007b4c8 .kvm_vcpu_ioctl+0x478/0x730
[c0000001dc53bca0] c0000000002140cc .do_vfs_ioctl+0x4ac/0x770
[c0000001dc53bd80] c0000000002143e8 .SyS_ioctl+0x58/0xb0
[c0000001dc53be30] c000000000009e58 syscall_exit+0x0/0x98

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-18 11:29:31 +01:00
Ingo Molnar
fe361cfcf4 Linux 3.13-rc4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1.4.15 (GNU/Linux)
 
 iQEcBAABAgAGBQJSrhGrAAoJEHm+PkMAQRiGsNoH/jIK3CsQ2lbW7yRLXmfgtbzz
 i2Kep6D4SDvmaLpLYOVC8xNYTiE8jtTbSXHomwP5wMZ63MQDhBfnEWsEWqeZ9+D9
 3Q46p0QWuoBgYu2VGkoxTfygkT6hhSpwWIi3SeImbY4fg57OHiUil/+YGhORM4Qc
 K4549OCTY3sIrgmWL77gzqjRUo+pQ4C73NKqZ3+5nlOmYBZC1yugk8mFwEpQkwhK
 4NRNU760Fo+XIht/bINqRiPMddzC15p0mxvJy3cDW8bZa1tFSS9SB7AQUULBbcHL
 +2dFlFOEb5SV1sNiNPrJ0W+h2qUh2e7kPB0F8epaBppgbwVdyQoC2u4uuLV2ZN0=
 =lI2r
 -----END PGP SIGNATURE-----

Merge tag 'v3.13-rc4' into perf/core

Merge Linux 3.13-rc4, to refresh this branch with the latest fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2013-12-16 14:51:32 +01:00
Anton Blanchard
a29e30efa3 powerpc: Fix endian issues in crash dump code
A couple more device tree properties that need byte swapping.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-13 15:48:39 +11:00
Anton Blanchard
f8a1883a83 powerpc: Fix topology core_id endian issue on LE builds
cpu_to_core_id() is missing a byteswap:

cat /sys/devices/system/cpu/cpu63/topology/core_id
201326592

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-13 15:48:34 +11:00
Anton Blanchard
01666c8ee2 powerpc: Fix endian issue in setup-common.c
During on LE boot we see:

    Partition configured for 1073741824 cpus, operating system maximum is 2048.

Clearly missing a byteswap here.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-13 15:48:34 +11:00
Ulrich Weigand
36aa1b180e powerpc: PTRACE_PEEKUSR always returns FPR0
There is a bug in using ptrace to access FPRs via PTRACE_PEEKUSR /
PTRACE_POKEUSR. In effect, trying to access any of the FPRs always
really accesses FPR0, which does seriously break debugging :-)

The problem seems to have been introduced by commit 3ad26e5c4459d
(Merge branch 'for-kvm' into next).

[ It is indeed a merge conflict between Paul's FPU/VSX state rework
and my LE patches - Anton ]

Signed-off-by: Ulrich Weigand <Ulrich.Weigand@de.ibm.com>
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-13 15:48:33 +11:00
Scott Wood
f5f972102d powerpc/kvm/booke: Fix build break due to stack frame size warning
Commit ce11e48b7fdd256ec68b932a89b397a790566031 ("KVM: PPC: E500: Add
userspace debug stub support") added "struct thread_struct" to the
stack of kvmppc_vcpu_run().  thread_struct is 1152 bytes on my build,
compared to 48 bytes for the recently-introduced "struct debug_reg".
Use the latter instead.

This fixes the following error:

cc1: warnings being treated as errors
arch/powerpc/kvm/booke.c: In function 'kvmppc_vcpu_run':
arch/powerpc/kvm/booke.c:760:1: error: the frame size of 1424 bytes is larger than 1024 bytes
make[2]: *** [arch/powerpc/kvm/booke.o] Error 1
make[1]: *** [arch/powerpc/kvm] Error 2
make[1]: *** Waiting for unfinished jobs....

Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-12-11 00:12:44 +01:00
Mahesh Salgaonkar
e641eb03ab powerpc: Fix up the kdump base cap to 128M
The current logic sets the kdump base to min of 2G or ppc64_rma_size/2.
On PowerNV kernel the first memory block 'memory@0' can be very large,
equal to the DIMM size with ppc64_rma_size value capped to 1G. Hence on
PowerNV, kdump base is set to 512M resulting kdump to fail while allocating
paca array. This is because, paca need its memory from RMA region capped
at 256M (see allocate_pacas()).

This patch lowers the kdump base cap to 128M so that kdump kernel can
successfully get memory below 256M for paca allocation.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-10 11:28:39 +11:00
Michael Ellerman
2d6f0c3ae6 powerpc: Fix build break with PPC_EARLY_DEBUG_BOOTX=y
A kernel configured with PPC_EARLY_DEBUG_BOOTX=y but PPC_PMAC=n and
PPC_MAPLE=n will fail to link:

  btext.c:(.text+0x2d0fc): undefined reference to `.rmci_off'
  btext.c:(.text+0x2d214): undefined reference to `.rmci_on'

Fix it by making the build of rmci_on/off() depend on
PPC_EARLY_DEBUG_BOOTX, which also enable the only code that uses them.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-10 11:25:03 +11:00
Jeremy Kerr
6f4441ef70 powerpc: Dynamically allocate slb_shadow from memblock
Currently, the slb_shadow buffer is our largest symbol:

  [jk@pablo linux]$ nm --size-sort -r -S obj/vmlinux | head -1
  c000000000da0000 0000000000040000 d slb_shadow

- we allocate 128 bytes per cpu; so 256k with NR_CPUS=2048. As we have
constant initialisers, it's allocated in .text, causing a larger vmlinux
image. We may also allocate unecessary slb_shadow buffers (> no. pacas),
since we use the build-time NR_CPUS rather than the run-time nr_cpu_ids.

We could move this to the bss, but then we still have the NR_CPUS vs
nr_cpu_ids potential for overallocation.

This change dynamically allocates the slb_shadow array, during
initialise_pacas(). At a cost of 104 bytes of text, we save 256k of
data:

  [jk@pablo linux]$ size obj/vmlinux{.orig,}
     text     data      bss       dec     hex	filename
  9202795  5244676  1169576  15617047  ee4c17	obj/vmlinux.orig
  9202899  4982532  1169576  15355007  ea4c7f	obj/vmlinux

Tested on pseries.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-09 11:40:26 +11:00
Jeremy Kerr
1a8f6f97ea powerpc: Make slb_shadow a local
The only external user of slb_shadow is the pseries lpar code, and it
can access through the paca array instead.

Signed-off-by: Jeremy Kerr <jk@ozlabs.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-09 11:40:25 +11:00
Brian King
fb48dc2282 powerpc: Increase EEH recovery timeout for SR-IOV
In order to support concurrent adapter firmware download
to SR-IOV adapters on pSeries, each VF will see an EEH event
where the slot will remain in the unavailable state for
the duration of the adapter firmware update, which can take
as long as 5 minutes. Extend the EEH recovery timeout to
account for this.

Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Acked-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:08:20 +11:00
Alexey Kardashevskiy
d905c5df9a PPC: POWERNV: move iommu_add_device earlier
The current implementation of IOMMU on sPAPR does not use iommu_ops
and therefore does not call IOMMU API's bus_set_iommu() which
1) sets iommu_ops for a bus
2) registers a bus notifier
Instead, PCI devices are added to IOMMU groups from
subsys_initcall_sync(tce_iommu_init) which does basically the same
thing without using iommu_ops callbacks.

However Freescale PAMU driver (https://lkml.org/lkml/2013/7/1/158)
implements iommu_ops and when tce_iommu_init is called, every PCI device
is already added to some group so there is a conflict.

This patch does 2 things:
1. removes the loop in which PCI devices were added to groups and
adds explicit iommu_add_device() calls to add devices as soon as they get
the iommu_table pointer assigned to them.
2. moves a bus notifier to powernv code in order to avoid conflict with
the notifier from Freescale driver.

iommu_add_device() and iommu_del_device() are public now.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:08:17 +11:00
Mahesh Salgaonkar
b63a0ffe35 powerpc/powernv: Machine check exception handling.
Add basic error handling in machine check exception handler.

- If MSR_RI isn't set, we can not recover.
- Check if disposition set to OpalMCE_DISPOSITION_RECOVERED.
- Check if address at fault is inside kernel address space, if not then send
  SIGBUS to process if we hit exception when in userspace.
- If address at fault is not provided then and if we get a synchronous machine
  check while in userspace then kill the task.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:06:06 +11:00
Mahesh Salgaonkar
b5ff4211a8 powerpc/book3s: Queue up and process delayed MCE events.
When machine check real mode handler can not continue into host kernel
in V mode, it returns from the interrupt and we loose MCE event which
never gets logged. In such a situation queue up the MCE event so that
we can log it later when we get back into host kernel with r1 pointing to
kernel stack e.g. during syscall exit.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:05:21 +11:00
Mahesh Salgaonkar
36df96f8ac powerpc/book3s: Decode and save machine check event.
Now that we handle machine check in linux, the MCE decoding should also
take place in linux host. This info is crucial to log before we go down
in case we can not handle the machine check errors. This patch decodes
and populates a machine check event which contain high level meaning full
MCE information.

We do this in real mode C code with ME bit on. The MCE information is still
available on emergency stack (in pt_regs structure format). Even if we take
another exception at this point the MCE early handler will allocate a new
stack frame on top of current one. So when we return back here we still have
our MCE information safe on current stack.

We use per cpu buffer to save high level MCE information. Each per cpu buffer
is an array of machine check event structure indexed by per cpu counter
mce_nest_count. The mce_nest_count is incremented every time we enter
machine check early handler in real mode to get the current free slot
(index = mce_nest_count - 1). The mce_nest_count is decremented once the
MCE info is consumed by virtual mode machine exception handler.

This patch provides save_mce_event(), get_mce_event() and release_mce_event()
generic routines that can be used by machine check handlers to populate and
retrieve the event. The routine release_mce_event() will free the event slot so
that it can be reused. Caller can invoke get_mce_event() with a release flag
either to release the event slot immediately OR keep it so that it can be
fetched again. The event slot can be also released anytime by invoking
release_mce_event().

This patch also updates kvm code to invoke get_mce_event to retrieve generic
mce event rather than paca->opal_mce_evt.

The KVM code always calls get_mce_event() with release flags set to false so
that event is available for linus host machine

If machine check occurs while we are in guest, KVM tries to handle the error.
If KVM is able to handle MC error successfully, it enters the guest and
delivers the machine check to guest. If KVM is not able to handle MC error, it
exists the guest and passes the control to linux host machine check handler
which then logs MC event and decides how to handle it in linux host. In failure
case, KVM needs to make sure that the MC event is available for linux host to
consume. Hence KVM always calls get_mce_event() with release flags set to false
and later it invokes release_mce_event() only if it succeeds to handle error.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:05:20 +11:00
Mahesh Salgaonkar
ae744f3432 powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power8.
This patch handles the memory errors on power8. If we get a machine check
exception due to SLB or TLB errors, then flush SLBs/TLBs and reload SLBs to
recover.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:04:40 +11:00
Mahesh Salgaonkar
e22a22740c powerpc/book3s: Flush SLB/TLBs if we get SLB/TLB machine check errors on power7.
If we get a machine check exception due to SLB or TLB errors, then flush
SLBs/TLBs and reload SLBs to recover. We do this in real mode before turning
on MMU. Otherwise we would run into nested machine checks.

If we get a machine check when we are in guest, then just flush the
SLBs and continue. This patch handles errors for power7. The next
patch will handle errors for power8

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:04:39 +11:00
Mahesh Salgaonkar
0440705049 powerpc/book3s: Add flush_tlb operation in cpu_spec.
This patch introduces flush_tlb operation in cpu_spec structure. This will
help us to invoke appropriate CPU-side flush tlb routine. This patch
adds the foundation to invoke CPU specific flush routine for respective
architectures. Currently this patch introduce flush_tlb for p7 and p8.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Acked-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-12-05 16:04:38 +11:00