linux

iv/linux

Author	SHA1	Message	Date
Benjamin Herrenschmidt	30aae739a9	Merge commit 'kumar/kumar-next' into next	2009-01-13 13:59:03 +11:00
Trent Piepho	19f5465e82	powerpc/fsl-booke: Don't hard-code size of struct tlbcam Some assembly code in head_fsl_booke.S hard-coded the size of struct tlbcam to 20 when it indexed the TLBCAM table. Anyone changing the size of struct tlbcam would not know to expect that. The kernel already has a system to get the size of C structures into assembly language files, asm-offsets, so let's use it. The definition of the struct gets moved to a header, so that asm-offsets.c can include it. Signed-off-by: Trent Piepho <tpiepho@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2009-01-07 15:33:06 -06:00
Hollis Blanchard	73e75b416f	KVM: ppc: Implement in-kernel exit timing statistics Existing KVM statistics are either just counters (kvm_stat) reported for KVM generally or trace based aproaches like kvm_trace. For KVM on powerpc we had the need to track the timings of the different exit types. While this could be achieved parsing data created with a kvm_trace extension this adds too much overhead (at least on embedded PowerPC) slowing down the workloads we wanted to measure. Therefore this patch adds a in-kernel exit timing statistic to the powerpc kvm code. These statistic is available per vm&vcpu under the kvm debugfs directory. As this statistic is low, but still some overhead it can be enabled via a .config entry and should be off by default. Since this patch touched all powerpc kvm_stat code anyway this code is now merged and simplified together with the exit timing statistic code (still working with exit timing disabled in .config). Signed-off-by: Christian Ehrhardt <ehrhardt@linux.vnet.ibm.com> Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:41 +02:00
Hollis Blanchard	7924bd4109	KVM: ppc: directly insert shadow mappings into the hardware TLB Formerly, we used to maintain a per-vcpu shadow TLB and on every entry to the guest would load this array into the hardware TLB. This consumed 1280 bytes of memory (64 entries of 16 bytes plus a struct page pointer each), and also required some assembly to loop over the array on every entry. Instead of saving a copy in memory, we can just store shadow mappings directly into the hardware TLB, accepting that the host kernel will clobber these as part of the normal 440 TLB round robin. When we do that we need less than half the memory, and we have decreased the exit handling time for all guest exits, at the cost of increased number of TLB misses because the host overwrites some guest entries. These savings will be increased on processors with larger TLBs or which implement intelligent flush instructions like tlbivax (which will avoid the need to walk arrays in software). In addition to that and to the code simplification, we have a greater chance of leaving other host userspace mappings in the TLB, instead of forcing all subsequent tasks to re-fault all their mappings. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:55:09 +02:00
Hollis Blanchard	db93f5745d	KVM: ppc: create struct kvm_vcpu_44x and introduce container_of() accessor This patch doesn't yet move all 44x-specific data into the new structure, but is the first step down that path. In the future we may also want to create a struct kvm_vcpu_booke. Based on patch from Liu Yu <yu.liu@freescale.com>. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:52:22 +02:00
Hollis Blanchard	0f55dc481e	KVM: ppc: Rename "struct tlbe" to "struct kvmppc_44x_tlbe" This will ease ports to other cores. Also remove unused "struct kvm_tlb" while we're at it. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@redhat.com>	2008-12-31 16:51:50 +02:00
Ilya Yanok	ca9153a3a2	powerpc/44x: Support 16K/64K base page sizes on 44x This adds support for 16k and 64k page sizes on PowerPC 44x processors. The PGDIR table is much smaller than a page when using 16k or 64k pages (512 and 32 bytes respectively) so we allocate the PGDIR with kzalloc() instead of __get_free_pages(). One PTE table covers rather a large memory area when using 16k or 64k pages (32MB or 512MB respectively), so we can easily put FIXMAP and PKMAP in the area covered by one PTE table. Signed-off-by: Yuri Tikhonov <yur@emcraft.com> Signed-off-by: Vladimir Panfilov <pvr@emcraft.com> Signed-off-by: Ilya Yanok <yanok@emcraft.com> Acked-by: Josh Boyer <jwboyer@linux.vnet.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-29 09:53:25 +11:00
Benjamin Herrenschmidt	5e696617c4	powerpc/mm: Split mmu_context handling This splits the mmu_context handling between 32-bit hash based processors, 64-bit hash based processors and everybody else. This is preliminary work for adding SMP support for BookE processors. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Acked-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-12-21 14:21:15 +11:00
Paul Mackerras	597bc5c00b	powerpc: Improve resolution of VDSO clock_gettime Currently the clock_gettime implementation in the VDSO produces a result with microsecond resolution for the cases that are handled without a system call, i.e. CLOCK_REALTIME and CLOCK_MONOTONIC. The nanoseconds field of the result is obtained by computing a microseconds value and multiplying by 1000. This changes the code in the VDSO to do the computation for clock_gettime with nanosecond resolution. That means that the resolution of the result will ultimately depend on the timebase frequency. Because the timestamp in the VDSO datapage (stamp_xsec, the real time corresponding to the timebase count in tb_orig_stamp) is in units of 2^-20 seconds, it doesn't have sufficient resolution for computing a result with nanosecond resolution. Therefore this adds a copy of xtime to the VDSO datapage and updates it in update_gtod() along with the other time-related fields. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-11-06 09:49:22 +11:00
Linus Torvalds	08d19f51f0	Merge branch 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm * 'kvm-updates/2.6.28' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (134 commits) KVM: ia64: Add intel iommu support for guests. KVM: ia64: add directed mmio range support for kvm guests KVM: ia64: Make pmt table be able to hold physical mmio entries. KVM: Move irqchip_in_kernel() from ioapic.h to irq.h KVM: Separate irq ack notification out of arch/x86/kvm/irq.c KVM: Change is_mmio_pfn to kvm_is_mmio_pfn, and make it common for all archs KVM: Move device assignment logic to common code KVM: Device Assignment: Move vtd.c from arch/x86/kvm/ to virt/kvm/ KVM: VMX: enable invlpg exiting if EPT is disabled KVM: x86: Silence various LAPIC-related host kernel messages KVM: Device Assignment: Map mmio pages into VT-d page table KVM: PIC: enhance IPI avoidance KVM: MMU: add "oos_shadow" parameter to disable oos KVM: MMU: speed up mmu_unsync_walk KVM: MMU: out of sync shadow core KVM: MMU: mmu_convert_notrap helper KVM: MMU: awareness of new kvm_mmu_zap_page behaviour KVM: MMU: mmu_parent_walk KVM: x86: trap invlpg KVM: MMU: sync roots on mmu reload ...	2008-10-16 15:36:00 -07:00
Hollis Blanchard	49dd2c4928	KVM: powerpc: Map guest userspace with TID=0 mappings When we use TID=N userspace mappings, we must ensure that kernel mappings have been destroyed when entering userspace. Using TID=1/TID=0 for kernel/user mappings and running userspace with PID=0 means that userspace can't access the kernel mappings, but the kernel can directly access userspace. The net is that we don't need to flush the TLB on privilege switches, but we do on guest context switches (which are far more infrequent). Guest boot time performance improvement: about 30%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Hollis Blanchard	83aae4a809	KVM: ppc: Write only modified shadow entries into the TLB on exit Track which TLB entries need to be written, instead of overwriting everything below the high water mark. Typically only a single guest TLB entry will be modified in a single exit. Guest boot time performance improvement: about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Hollis Blanchard	20754c2495	KVM: ppc: Stop saving host TLB state We're saving the host TLB state to memory on every exit, but never using it. Originally I had thought that we'd want to restore host TLB for heavyweight exits, but that could actually hurt when context switching to an unrelated host process (i.e. not qemu). Since this decreases the performance penalty of all exits, this patch improves guest boot time by about 15%. Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-10-15 10:15:16 +02:00
Becky Bruce	4ee7084eb1	POWERPC: Allow 32-bit hashed pgtable code to support 36-bit physical This rearranges a bit of code, and adds support for 36-bit physical addressing for configs that use a hashed page table. The 36b physical support is not enabled by default on any config - it must be explicitly enabled via the config system. This patch only expands the page table code to accomodate large physical addresses on 32-bit systems and enables the PHYS_64BIT config option for 86xx. It does not allow you to boot a board with more than about 3.5GB of RAM - for that, SWIOTLB support is also required (and coming soon). Signed-off-by: Becky Bruce <becky.bruce@freescale.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2008-09-24 16:29:44 -05:00
Paul Mackerras	1f6a93e4c3	powerpc: Make it possible to move the interrupt handlers away from the kernel This changes the way that the exception prologs transfer control to the handlers in 64-bit kernels with the aim of making it possible to have the prologs separate from the main body of the kernel. Now, instead of computing the address of the handler by taking the top 32 bits of the paca address (to get the 0xc0000000........ part) and ORing in something in the bottom 16 bits, we get the base address of the kernel by doing a load from the paca and add an offset. This also replaces an mfmsr and an ori to compute the MSR value for the handler with a load from the paca. That makes it unnecessary to have a separate version of EXCEPTION_PROLOG_PSERIES that forces 64-bit mode. We can no longer use a direct branches in the exception prolog code, which means that the SLB miss handlers can't branch directly to .slb_miss_realmode any more. Instead we have to compute the address and do an indirect branch. This is conditional on CONFIG_RELOCATABLE; for non-relocatable kernels we use a direct branch as before. (A later change will allow CONFIG_RELOCATABLE to be set on 64-bit powerpc.) Since the secondary CPUs on pSeries start execution in the first 0x100 bytes of real memory and then have to get to wherever the kernel is, we can't use a direct branch to get there. Instead this changes __secondary_hold_spinloop from a flag to a function pointer. When it is set to a non-NULL value, the secondary CPUs jump to the function pointed to by that value. Finally this eliminates one code difference between 32-bit and 64-bit by making __secondary_hold be the text address of the secondary CPU spinloop rather than a function descriptor for it. Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-09-15 11:08:08 -07:00
Michael Neuling	c6e6771b87	powerpc: Introduce VSX thread_struct and CONFIG_VSX The layout of the new VSR registers and how they overlap on top of the legacy FPR and VR registers is: VSR doubleword 0 VSR doubleword 1 ---------------------------------------------------------------- VSR[0] \| FPR[0] \| \| ---------------------------------------------------------------- VSR[1] \| FPR[1] \| \| ---------------------------------------------------------------- \| ... \| \| \| ... \| \| ---------------------------------------------------------------- VSR[30] \| FPR[30] \| \| ---------------------------------------------------------------- VSR[31] \| FPR[31] \| \| ---------------------------------------------------------------- VSR[32] \| VR[0] \| ---------------------------------------------------------------- VSR[33] \| VR[1] \| ---------------------------------------------------------------- \| ... \| \| ... \| ---------------------------------------------------------------- VSR[62] \| VR[30] \| ---------------------------------------------------------------- VSR[63] \| VR[31] \| ---------------------------------------------------------------- VSX has 64 128bit registers. The first 32 regs overlap with the FP registers and hence extend them with and additional 64 bits. The second 32 regs overlap with the VMX registers. This commit introduces the thread_struct changes required to reflect this register layout. Ptrace and signals code is updated so that the floating point registers are correctly accessed from the thread_struct when CONFIG_VSX is enabled. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-07-01 11:28:46 +10:00
Kumar Gala	fca622c5b2	[POWERPC] 40x/Book-E: Save/restore volatile exception registers On machines with more than one exception level any system register that might be modified by the "normal" exception level needs to be saved and restored on taking a higher level exception. We already are saving and restoring ESR and DEAR. For critical level add SRR0/1. For debug level add CSRR0/1 and SRR0/1. For machine check level add DSRR0/1, CSRR0/1, and SRR0/1. On FSL Book-E parts we always save/restore the MAS registers for critical, debug, and machine check level exceptions. On 44x we always save/restore the MMUCR. Additionally, we save and restore the ksp_limit since we have to adjust it for each exception level. Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Acked-by: Paul Mackerras <paulus@samba.org>	2008-06-02 14:56:35 -05:00
Linus Torvalds	867a89e0b7	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [RAPIDIO] Change RapidIO doorbell source and target ID field to 16-bit [RAPIDIO] Add RapidIO connection info print out and re-training for broken connections [RAPIDIO] Add serial RapidIO controller support, which includes MPC8548, MPC8641 [RAPIDIO] Add RapidIO node probing into MPC86xx_HPCN board id table [RAPIDIO] Add RapidIO node into MPC8641HPCN dts file [RAPIDIO] Auto-probe the RapidIO system size [RAPIDIO] Add OF-tree support to RapidIO controller driver [RAPIDIO] Add RapidIO multi mport support [RAPIDIO] Move include/asm-ppc/rio.h to asm-powerpc [RAPIDIO] Add RapidIO option to kernel configuration [RAPIDIO] Change RIO function mpc85xx_ to fsl_ [POWERPC] Provide walk_memory_resource() for powerpc [POWERPC] Update lmb data structures for hotplug memory add/remove [POWERPC] Hotplug memory remove notifications for powerpc [POWERPC] windfarm: Add PowerMac 12,1 support [POWERPC] Fix building of pmac32 when CONFIG_NVRAM=m [POWERPC] Add IRQSTACKS support on ppc32 [POWERPC] Use __always_inline for xchg* and cmpxchg* [POWERPC] Add fast little-endian switch system call	2008-04-29 08:19:14 -07:00
Christoph Lameter	d4d298feea	ppc/powerpc: use kbuild.h instead of defining macros in asm-offsets.c Signed-off-by: Christoph Lameter <clameter@sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-04-29 08:06:30 -07:00
Kumar Gala	85218827cc	[POWERPC] Add IRQSTACKS support on ppc32 This makes it possible to use separate stacks for hard and soft IRQs on 32-bit powerpc as well as on 64-bit. The code for 32-bit is just the 32-bit analog of the 64-bit code. * Added allocation and initialization of the irq stacks. We limit the stacks to be in lowmem for ppc32. * Implemented ppc32 versions of call_do_softirq() and call_handle_irq() to switch the stack pointers * Reworked how we do stack overflow detection. We now keep around the limit of the stack in the thread_struct and compare against the limit to see if we've overflowed. We can now use this on ppc64 if desired. [ paulus@samba.org: Fixed bug on 6xx where we need to reload r9 with the thread_info pointer. ] Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-29 15:57:34 +10:00
Hollis Blanchard	bbf45ba57e	KVM: ppc: PowerPC 440 KVM implementation This functionality is definitely experimental, but is capable of running unmodified PowerPC 440 Linux kernels as guests on a PowerPC 440 host. (Only tested with 440EP "Bamboo" guests so far, but with appropriate userspace support other SoC/board combinations should work.) See Documentation/powerpc/kvm_440.txt for technical details. [stephen: build fix] Signed-off-by: Hollis Blanchard <hollisb@us.ibm.com> Acked-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Avi Kivity <avi@qumranet.com>	2008-04-27 18:21:39 +03:00
Kumar Gala	91120cc8e0	[POWERPC] Cleanup asm-offsets.c * Removed TI_EXECDOMAIN define as its not used anywhere * Use STACK_INT_FRAME_SIZE to allow common define of INT_FRAME_SIZE * Define TI_CPU on both ppc32 & ppc64 (removes an ifdef). Signed-off-by: Kumar Gala <galak@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-24 20:58:02 +10:00
Stephen Rothwell	3eb9cf0761	[POWERPC] iSeries: Use alternate paca structure for booting The iSeries HV only needs the first two fields of the paca statically initialised, so create an alternate paca that contains only those and switch to our real paca immediately after boot. This is in order to make the 1024 cpu patches easier since they will no longer have to statically initialise the pacas for iSeries. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-04-15 21:21:25 +10:00
Roland McGrath	163dab39b5	[POWERPC] powerpc32: Remove asm-offsets ptrace cruft These items in asm-offsets.c are not used anywhere. This removes them. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-03-26 08:44:05 +11:00
Tony Breeds	151db1fc23	Fix compilation of powerpc asm-offsets.c with old gcc Commit `ad7f71674a` ("[POWERPC] Use a sensible default for clock_getres() in the VDSO") corrected the clock resolution reported by the VDSO clock_getres() but introduced another problem in that older versions of gcc (gcc-4.0 and earlier) fail to compile the new code in arch/powerpc/kernel/asm-offsets.c. This fixes it by introducing a new MONOTONIC_RES_NSEC define in the generic code which is equivalent to KTIME_MONOTONIC_RES but is just an integer constant, not a ktime union. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2008-02-07 14:54:45 -08:00
Tony Breeds	ad7f71674a	[POWERPC] Use a sensible default for clock_getres() in the VDSO This ensures that the syscall and the (fast) vdso versions of clock_getres() will return the same resolution. Signed-off-by: Tony Breeds <tony@bakeyournoodle.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2008-02-06 16:30:00 +11:00
Kumar Gala	bee86f14d5	[POWERPC] Fix swapper_pg_dir size when CONFIG_PTE_64BIT=y on FSL_BOOKE The size of swapper_pg_dir is 8k instead of 4k when using 64-bit PTEs (CONFIG_PTE_64BIT). This was reported by Cedric Hombourger <chombourger@gmail.com> Signed-off-by: Kumar Gala <galak@kernel.crashing.org>	2007-12-06 13:11:04 -06:00
Olof Johansson	fbe481756d	[POWERPC] vdso: Fixes for cache block sizes The current VDSO implementation is hardcoded to 128 byte cache blocks, which are only used on IBM's 64-bit processors. Convert it to get the cache block sizes out of vdso_data instead, similar to how the ppc64 in-kernel cache flush does it. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-11-20 13:56:31 +11:00
Michael Neuling	4603ac180a	powerpc: add scaled time accounting This adds POWERPC specific hooks for scaled time accounting. POWER6 includes a SPURR register. The SPURR is based off the PURR register but is scaled based on CPU frequency and issue rates. This gives a more accurate account of the instructions used per task. The PURR and timebase will be constant relative to the wall clock, irrespective of the CPU frequency. This implementation reads the SPURR register in account_system_vtime which is only call called on context witch and hard and soft irq entry and exit. The percentage of user and system time is then estimated using the ratio of these accounted by the PURR. If the SPURR is not present, the PURR read. An earlier implementation of this patch read the SPURR whenever the PURR was read, which included the system call entry and exit path. Unfortunately this showed a performance regression on lmbench runs, so was re-implemented. I've included the lmbench results here when run bare metal on POWER6. 1st column is the unpatch results. 2nd column is the results using the below patch and the 3rd is the % diff of these results from the base. 4th and 5th columns are the results and % differnce from the base using the older patch (SPURR read in syscall entry/exit path). Base Scaled-Acct SPURR-in-syscall Result Result % diff Result % diff Simple syscall: 0.3086 0.3086 0.0000 0.3452 11.8600 Simple read: 0.4591 0.4671 1.7425 0.5044 9.86713 Simple write: 0.4364 0.4366 0.0458 0.4731 8.40971 Simple stat: 2.0055 2.0295 1.1967 2.0669 3.06158 Simple fstat: 0.5962 0.5876 -1.442 0.6368 6.80979 Simple open/close: 3.1283 3.1009 -0.875 3.2088 2.57328 Select on 10 fd's: 0.8554 0.8457 -1.133 0.8667 1.32101 Select on 100 fd's: 3.5292 3.6329 2.9383 3.6664 3.88756 Select on 250 fd's: 7.9097 8.1881 3.5197 8.2242 3.97613 Select on 500 fd's: 15.2659 15.836 3.7357 15.873 3.97814 Select on 10 tcp fd's: 0.9576 0.9416 -1.670 0.9752 1.83792 Select on 100 tcp fd's: 7.248 7.2254 -0.311 7.2685 0.28283 Select on 250 tcp fd's: 17.7742 17.707 -0.375 17.749 -0.1406 Select on 500 tcp fd's: 35.4258 35.25 -0.496 35.286 -0.3929 Signal handler installation: 0.6131 0.6075 -0.913 0.647 5.52927 Signal handler overhead: 2.0919 2.1078 0.7600 2.1831 4.35967 Protection fault: 0.7345 0.7478 1.8107 0.8031 9.33968 Pipe latency: 33.006 16.398 -50.31 33.475 1.42368 AF_UNIX sock stream latency: 14.5093 30.910 113.03 30.715 111.692 Process fork+exit: 219.8 222.8 1.3648 229.37 4.35623 Process fork+execve: 876.14 873.28 -0.32 868.66 -0.8533 Process fork+/bin/sh -c: 2830 2876.5 1.6431 2958 4.52296 File /var/tmp/XXX write bw: 1193497 1195536 0.1708 118657 -0.5799 Pagefaults on /var/tmp/XXX: 3.1272 3.2117 2.7020 3.2521 3.99398 Also, kernel compile times show no difference with this patch applied. [pbadari@us.ibm.com: Avoid unnecessary PURR reading] Signed-off-by: Michael Neuling <mikey@neuling.org> Cc: Balbir Singh <balbir@in.ibm.com> Cc: Jay Lan <jlan@engr.sgi.com> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Badari Pulavarty <pbadari@us.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-10-18 14:37:28 -07:00
Stephen Rothwell	ee7a76da1e	[POWERPC] Size swapper_pg_dir correctly David Gibson pointed out that swapper_pg_dir actually need to be PGD_TABLE_SIZE bytes long not PAGE_SIZE. This actually saves 64k in the bss for a kernel ppc64_defconfig built with CONFIG_PPC_64K_PAGES. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-09-19 15:25:34 +10:00
Stephen Rothwell	16a15a30f8	[POWERPC] iSeries: Clean up lparmap mess We need to have xLparMap in head_64.S so that it is at a fixed address (because the linker will not resolve (address & 0xffffffff) for us). But the assembler miscalculates the KERNEL_VSID() expressions. So put the confusing expressions into asm-offsets.c. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-08-22 15:21:46 +10:00
Linus Torvalds	aabded9c3a	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc * 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/paulus/powerpc: [POWERPC] Further fixes for the removal of 4level-fixup hack from ppc32 [POWERPC] EEH: log all PCI-X and PCI-E AER registers [POWERPC] EEH: capture and log pci state on error [POWERPC] EEH: Split up long error msg [POWERPC] EEH: log error only after driver notification. [POWERPC] fsl_soc: Make mac_addr const in fs_enet_of_init(). [POWERPC] Don't use SLAB/SLUB for PTE pages [POWERPC] Spufs support for 64K LS mappings on 4K kernels [POWERPC] Add ability to 4K kernel to hash in 64K pages [POWERPC] Introduce address space "slices" [POWERPC] Small fixes & cleanups in segment page size demotion [POWERPC] iSeries: Make HVC_ISERIES the default [POWERPC] iSeries: suppress build warning in lparmap.c [POWERPC] Mark pages that don't exist as nosave [POWERPC] swsusp: Introduce register_nosave_region_late	2007-05-09 12:56:01 -07:00
Roman Zippel	f7e4217b00	rename thread_info to stack This finally renames the thread_info field in task structure to stack, so that the assumptions about this field are gone and archs have more freedom about placing the thread_info structure. Nonbroken archs which have a proper thread pointer can do the access to both current thread and task structure via a single pointer. It'll allow for a few more cleanups of the fork code, from which e.g. ia64 could benefit. Signed-off-by: Roman Zippel <zippel@linux-m68k.org> [akpm@linux-foundation.org: build fix] Cc: Richard Henderson <rth@twiddle.net> Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru> Cc: Russell King <rmk@arm.linux.org.uk> Cc: Ian Molton <spyro@f2s.com> Cc: Haavard Skinnemoen <hskinnemoen@atmel.com> Cc: Mikael Starvik <starvik@axis.com> Cc: David Howells <dhowells@redhat.com> Cc: Yoshinori Sato <ysato@users.sourceforge.jp> Cc: "Luck, Tony" <tony.luck@intel.com> Cc: Hirokazu Takata <takata@linux-m32r.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Roman Zippel <zippel@linux-m68k.org> Cc: Greg Ungerer <gerg@uclinux.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Ralf Baechle <ralf@linux-mips.org> Cc: Paul Mackerras <paulus@samba.org> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: Paul Mundt <lethal@linux-sh.org> Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp> Cc: Richard Curnow <rc@rc0.org.uk> Cc: William Lee Irwin III <wli@holomorphy.com> Cc: "David S. Miller" <davem@davemloft.net> Cc: Jeff Dike <jdike@addtoit.com> Cc: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Cc: Miles Bader <uclinux-v850@lsi.nec.co.jp> Cc: Andi Kleen <ak@muc.de> Cc: Chris Zankel <chris@zankel.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2007-05-09 12:30:56 -07:00
Benjamin Herrenschmidt	d0f13e3c20	[POWERPC] Introduce address space "slices" The basic issue is to be able to do what hugetlbfs does but with different page sizes for some other special filesystems; more specifically, my need is: - Huge pages - SPE local store mappings using 64K pages on a 4K base page size kernel on Cell - Some special 4K segments in 64K-page kernels for mapping a dodgy type of powerpc-specific infiniband hardware that requires 4K MMU mappings for various reasons I won't explain here. The main issues are: - To maintain/keep track of the page size per "segment" (as we can only have one page size per segment on powerpc, which are 256MB divisions of the address space). - To make sure special mappings stay within their allotted "segments" (including MAP_FIXED crap) - To make sure everybody else doesn't mmap/brk/grow_stack into a "segment" that is used for a special mapping Some of the necessary mechanisms to handle that were present in the hugetlbfs code, but mostly in ways not suitable for anything else. The patch relies on some changes to the generic get_unmapped_area() that just got merged. It still hijacks hugetlb callbacks here or there as the generic code hasn't been entirely cleaned up yet but that shouldn't be a problem. So what is a slice ? Well, I re-used the mechanism used formerly by our hugetlbfs implementation which divides the address space in "meta-segments" which I called "slices". The division is done using 256MB slices below 4G, and 1T slices above. Thus the address space is divided currently into 16 "low" slices and 16 "high" slices. (Special case: high slice 0 is the area between 4G and 1T). Doing so simplifies significantly the tracking of segments and avoids having to keep track of all the 256MB segments in the address space. While I used the "concepts" of hugetlbfs, I mostly re-implemented everything in a more generic way and "ported" hugetlbfs to it. Slices can have an associated page size, which is encoded in the mmu context and used by the SLB miss handler to set the segment sizes. The hash code currently doesn't care, it has a specific check for hugepages, though I might add a mechanism to provide per-slice hash mapping functions in the future. The slice code provide a pair of "generic" get_unmapped_area() (bottomup and topdown) functions that should work with any slice size. There is some trickiness here so I would appreciate people to have a look at the implementation of these and let me know if I got something wrong. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-09 16:35:00 +10:00
Johannes Berg	543b9fd352	[POWERPC] powermac: Suspend to disk on G5 Powermac G5 suspend to disk implementation. The code is platform agnostic but only tested on powermac, no other 64-bit powerpc machines. Because nvidiafb still breaks suspend I have marked it EXPERIMENTAL on powermac and because I can't test it and some lowlevel code will need changes it is BROKEN on all other 64-bit platforms. Signed-off-by: Johannes Berg <johannes@sipsolutions.net> Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-05-07 20:31:14 +10:00
Olof Johansson	687304014f	[POWERPC] Save trap number in bad_stack Save the trap number in the case of getting a bad stack in an exception handler. It is sometimes useful to know what exception it was that caused this to happen. Without this, no trap number is reported. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-04-24 22:06:59 +10:00
Anton Blanchard	4002aca771	[POWERPC] Remove last_syscall Remove last_syscall from 32bit powerpc, its been gone in 64bit for years. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-03-22 22:52:58 +11:00
David Woodhouse	007d88d042	[POWERPC] Fix manual assembly WARN_ON() in enter_rtas(). When we switched over to the generic BUG mechanism we forgot to change the assembly code which open-codes a WARN_ON() in enter_rtas(), so the bug table got corrupted. This patch provides an EMIT_BUG_ENTRY macro for use in assembly code, and uses it in entry_64.S. Tested with CONFIG_DEBUG_BUGVERBOSE on ppc64 but not without -- I tried to turn it off but it wouldn't go away; I suspect Aunt Tillie probably needed it. This version gets __FILE__ and __LINE__ right in the assembly version -- rather than saying include/asm-powerpc/bug.h line 21 every time which is a little suboptimal. Signed-off-by: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2007-01-09 17:03:02 +11:00
Paul Mackerras	d04c56f73c	[POWERPC] Lazy interrupt disabling for 64-bit machines This implements a lazy strategy for disabling interrupts. This means that local_irq_disable() et al. just clear the 'interrupts are enabled' flag in the paca. If an interrupt comes along, the interrupt entry code notices that interrupts are supposed to be disabled, and clears the EE bit in SRR1, clears the 'interrupts are hard-enabled' flag in the paca, and returns. This means that interrupts only actually get disabled in the processor when an interrupt comes along. When interrupts are enabled by local_irq_enable() et al., the code sets the interrupts-enabled flag in the paca, and then checks whether interrupts got hard-disabled. If so, it also sets the EE bit in the MSR to hard-enable the interrupts. This has the potential to improve performance, and also makes it easier to make a kernel that can boot on iSeries and on other 64-bit machines, since this lazy-disable strategy is very similar to the soft-disable strategy that iSeries already uses. This version renames paca->proc_enabled to paca->soft_enabled, and changes a couple of soft-disables in the kexec code to hard-disables, which should fix the crash that Michael Ellerman saw. This doesn't yet use a reserved CR field for the soft_enabled and hard_enabled flags. This applies on top of Stephen Rothwell's patches to make it possible to build a combined iSeries/other kernel. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-10-16 16:31:36 +10:00
Olof Johansson	f04da0bc36	[POWERPC] Fix non-smp build This fixes a compile error that only surfaces on CONFIG_SMP=n builds; <asm/hvcall.h> seems to get pulled in through another header file for SMP builds. This problem was introduced by the hvcall stats patch. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-09-14 10:36:11 +10:00
Mike Kravetz	57852a853b	[POWERPC] powerpc: Instrument Hypervisor Calls Add instrumentation for hypervisor calls on pseries. Call statistics include number of calls, wall time and cpu cycles (if available) and are made available via debugfs. Instrumentation code is behind the HCALL_STATS config option and has no impact if not enabled. Signed-off-by: Mike Kravetz <kravetz@us.ibm.com> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-09-13 18:39:53 +10:00
Olof Johansson	f39b7a55a8	[POWERPC] Cleanup CPU inits Cleanup CPU inits a bit more, Geoff Levand already did some earlier. * Move CPU state save to cpu_setup, since cpu_setup is only ever done on cpu 0 on 64-bit and save is never done more than once. * Rename __restore_cpu_setup to __restore_cpu_ppc970 and add function pointers to the cputable to use instead. Powermac always has 970 so no need to check there. * Rename __970_cpu_preinit to __cpu_preinit_ppc970 and check PVR before calling it instead of in it, it's too early to use cputable. * Rename pSeries_secondary_smp_init to generic_secondary_smp_init since everyone but powermac and iSeries use it. Signed-off-by: Olof Johansson <olof@lixom.net> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-25 13:27:35 +10:00
Michael Neuling	11a27ad782	[POWERPC] SLB shadow buffer cleanup Cleanup some of the #define magic as suggested by Milton. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-25 13:17:08 +10:00
Michael Neuling	2f6093c847	[POWERPC] Implement SLB shadow buffer This adds a shadow buffer for the SLBs and regsiters it with PHYP. Only the bolted SLB entries (top 3) are shadowed. The SLB shadow buffer tells the hypervisor what the kernel needs to have in the SLB for the kernel to be able to function. The hypervisor can use this information to speed up partition context switches. Signed-off-by: Michael Neuling <mikey@neuling.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-08-08 17:08:56 +10:00
Stephen Rothwell	54f5cd8afa	[POWERPC] iseries: Remove unnecessary include of iseries/hv_lp_event.h Also remove unnecessary reference to struct HvLpEvent. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>	2006-07-13 18:56:56 +10:00
Jörn Engel	6ab3d5624e	Remove obsolete #include <linux/config.h> Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>	2006-06-30 19:25:36 +02:00
Paul Mackerras	bf72aeba2f	powerpc: Use 64k pages without needing cache-inhibited large pages Some POWER5+ machines can do 64k hardware pages for normal memory but not for cache-inhibited pages. This patch lets us use 64k hardware pages for most user processes on such machines (assuming the kernel has been configured with CONFIG_PPC_64K_PAGES=y). User processes start out using 64k pages and get switched to 4k pages if they use any non-cacheable mappings. With this, we use 64k pages for the vmalloc region and 4k pages for the imalloc region. If anything creates a non-cacheable mapping in the vmalloc region, the vmalloc region will get switched to 4k pages. I don't know of any driver other than the DRM that would do this, though, and these machines don't have AGP. When a region gets switched from 64k pages to 4k pages, we do not have to clear out all the 64k HPTEs from the hash table immediately. We use the _PAGE_COMBO bit in the Linux PTE to indicate whether the page was hashed in as a 64k page or a set of 4k pages. If hash_page is trying to insert a 4k page for a Linux PTE and it sees that it has already been inserted as a 64k page, it first invalidates the 64k HPTE before inserting the 4k HPTE. The hash invalidation routines also use the _PAGE_COMBO bit, to determine whether to look for a 64k HPTE or a set of 4k HPTEs to remove. With those two changes, we can tolerate a mix of 4k and 64k HPTEs in the hash table, and they will all get removed when the address space is torn down. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-15 10:45:18 +10:00
Paul Mackerras	4306443128	powerpc: Remove unused paca->pgdir field The pgdir field in the paca was a leftover from the dynamic VSIDs patch, and is not used in the current kernel code. This removes it. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-06-12 18:38:21 +10:00
Paul Mackerras	f39224a8c1	powerpc: Use correct sequence for putting CPU into nap mode We weren't using the recommended sequence for putting the CPU into nap mode. When I changed the idle loop, for some reason 7447A cpus started hanging when we put them into nap mode. Changing to the recommended sequence fixes that. The complexity here is that the recommended sequence is a loop that keeps putting the cpu back into nap mode. Clearly we need some way to break out of the loop when an interrupt (external interrupt, decrementer, performance monitor) occurs. Here we use a bit in the thread_info struct to indicate that we need this, and the exception entry code notices this and arranges for the exception to return to the value in the link register, thus breaking out of the loop. We use a new `local_flags' field in the thread_info which we can alter without needing to use an atomic update sequence. The PPC970 has the same recommended sequence, so we do the same thing there too. This also fixes a bug in the kernel stack overflow handling code on 32-bit, since it was causing a value that we needed in a register to get trashed. Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-04-18 21:49:11 +10:00
Benjamin Herrenschmidt	e8222502ee	[PATCH] powerpc: Kill _machine and hard-coded platform numbers This removes statically assigned platform numbers and reworks the powerpc platform probe code to use a better mechanism. With this, board support files can simply declare a new machine type with a macro, and implement a probe() function that uses the flattened device-tree to detect if they apply for a given machine. We now have a machine_is() macro that replaces the comparisons of _machine with the various PLATFORM_* constants. This commit also changes various drivers to use the new macro instead of looking at _machine. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Paul Mackerras <paulus@samba.org>	2006-03-28 23:15:54 +11:00

1 2

68 Commits