33438 Commits

Author SHA1 Message Date
Andrew Hastings
801916c1b3 x86: fix off-by-one in find_next_zero_string
Fix an off-by-one error in find_next_zero_string which prevents
allocating the last bit.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Andrew Hastings <abh@cray.com> on behalf of Cray Inc.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:22 +02:00
Laurent Vivier
6442eea937 i386: export i386 smp_call_function_mask() to modules
This patch export i386 smp_call_function_mask() with EXPORT_SYMBOL().

This function is needed by KVM to call a function on a set of CPUs.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Laurent Vivier <Laurent.Vivier@bull.net>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:21 +02:00
Roland McGrath
f79eb83b3a x86: Install unstripped copy of 64bit vdso to disk
This keeps an unstripped copy of the 64bit vDSO images built before they are
stripped and embedded in the kernel.  The unstripped copies get installed
in $(MODLIB)/vdso/ by "make install" (or you can explicitly use the
subtarget "make vdso_install").  These files can be useful when they
contain source-level debugging information.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:20 +02:00
Roland McGrath
af7e6a7464 x86_64: install unstripped copies of compat vdso on disk
This keeps an unstripped copy of the vDSO images built before they are
stripped and embedded in the kernel.  The unstripped copies get installed
in $(MODLIB)/vdso/ by "make install" (or you can explicitly use the
subtarget "make vdso_install").  These files can be useful when they
contain source-level debugging information.

[ tglx: arch/x86 adaptation ]

Signed-off-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:18 +02:00
Adrian Bunk
8957ecab02 i386: setup_trampoline() must be __cpuinit
WARNING: arch/i386/kernel/built-in.o(.text+0xf201): Section mismatch: reference to .init.data:trampoline_end (between 'setup_trampoline' and 'cpu_coregroup_map')
WARNING: arch/i386/kernel/built-in.o(.text+0xf207): Section mismatch: reference to .init.data:trampoline_data (between 'setup_trampoline' and 'cpu_coregroup_map')
WARNING: arch/i386/kernel/built-in.o(.text+0xf21a): Section mismatch: reference to .init.data:trampoline_data (between 'setup_trampoline' and 'cpu_coregroup_map')

Harmless but annoying warnings present when building an i386 SMP kernel
with CONFIG_HOTPLUG_CPU=n and gcc < 4.0 .

[ tglx: arch/x86 adaptation ]

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:17 +02:00
Stephane Eranian
0f8e45a288 i386: make Oprofile call shutdown() only once per session
Oprofile: call model->shutdown() only once to avoid calling release_ev*()
multiple times

[ tglx: arch/x86 adaptation ]

Signed-off-by: Stephane Eranian <eranian@hpl.hp.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
2007-10-17 20:15:14 +02:00
Thomas Gleixner
3dfbc88464 x86: C1E late detection fix. Really switch off lapic timer
Doh, I completely missed that devices marked DUMMY are not running
the set_mode function. So we force broadcasting, but we keep the
local APIC timer running.

Let the clock event layer mark the device _after_ switching it off.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2007-10-17 20:15:13 +02:00
Linus Torvalds
fb9fc39517 Merge branch 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen
* 'xen-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
  xfs: eagerly remove vmap mappings to avoid upsetting Xen
  xen: add some debug output for failed multicalls
  xen: fix incorrect vcpu_register_vcpu_info hypercall argument
  xen: ask the hypervisor how much space it needs reserved
  xen: lock pte pages while pinning/unpinning
  xen: deal with stale cr3 values when unpinning pagetables
  xen: add batch completion callbacks
  xen: yield to IPI target if necessary
  Clean up duplicate includes in arch/i386/xen/
  remove dead code in pgtable_cache_init
  paravirt: clean up lazy mode handling
  paravirt: refactor struct paravirt_ops into smaller pv_*_ops
2007-10-17 11:10:11 -07:00
Linus Torvalds
5c8e191e84 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup
* 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup:
  Remove magic macros for screen_info structure members
  [x86] remove uses of magic macros for boot_params access
2007-10-17 09:00:30 -07:00
Adrian Bunk
cba4fbbff2 remove include/asm-*/ipc.h
All asm/ipc.h files do only #include <asm-generic/ipc.h>.

This patch therefore removes all include/asm-*/ipc.h files and moves the
contents of include/asm-generic/ipc.h to include/linux/ipc.h.

Signed-off-by: Adrian Bunk <bunk@kernel.org>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:55 -07:00
Ken'ichi Ohmichi
bcbba6c10e add-vmcore: add a prefix "VMCOREINFO_" to the vmcoreinfo macros
Add a prefix "VMCOREINFO_" to the vmcoreinfo macros.  Old vmcoreinfo macros
were defined as generic names SYMBOL/SIZE/OFFSET /LENGTH/CONFIG, and it is
impossible to grep for them.  So these names should be changed.  This
discussion is the following:
http://www.ussg.iu.edu/hypermail/linux/kernel/0709.1/0415.html

Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Ken'ichi Ohmichi
fd59d231f8 Add vmcoreinfo
This patch set frees the restriction that makedumpfile users should install a
vmlinux file (including the debugging information) into each system.

makedumpfile command is the dump filtering feature for kdump.  It creates a
small dumpfile by filtering unnecessary pages for the analysis.  To
distinguish unnecessary pages, it needs a vmlinux file including the debugging
information.  These days, the debugging package becomes a huge file, and it is
hard to install it into each system.

To solve the problem, kdump developers discussed it at lkml and kexec-ml.  As
the result, we reached the conclusion that necessary information for dump
filtering (called "vmcoreinfo") should be embedded into the first kernel file
and it should be accessed through /proc/vmcore during the second kernel.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.0/1806.html)

Dan Aloni created the patch set for the above implementation.
(http://www.uwsg.iu.edu/hypermail/linux/kernel/0707.1/1053.html)

And I updated it for multi architectures and memory models.
(http://lists.infradead.org/pipermail/kexec/2007-August/000479.html)

Signed-off-by: Dan Aloni <da-x@monatomic.org>
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Bernhard Walle <bwalle@suse.de>
Signed-off-by: Daisuke Nishimura <nishimura@mxp.nes.nec.co.jp>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:54 -07:00
Neil Horman
7dc0b22e3c core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe
For some time /proc/sys/kernel/core_pattern has been able to set its output
destination as a pipe, allowing a user space helper to receive and
intellegently process a core.  This infrastructure however has some
shortcommings which can be enhanced.  Specifically:

1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
   when core_pattern is a pipe, since file system resources are not being
   consumed in this case, unless the user application wishes to save the core,
   at which point the app is restricted by usual file system limits and
   restrictions.

2) The core_pattern code should be able to parse and pass options to the
   user space helper as an argv array.  The real core limit of the uid of the
   crashing proces should also be passable to the user space helper (since it
   is overridden to zero when called).

3) Some miscellaneous bugs need to be cleaned up (specifically the
   recognition of a recursive core dump, should the user mode helper itself
   crash.  Also, the core dump code in the kernel should not wait for the user
   mode helper to exit, since the same context is responsible for writing to
   the pipe, and a read of the pipe by the user mode helper will result in a
   deadlock.

This patch:

Remove the check of RLIMIT_CORE if core_pattern is a pipe.  In the event that
core_pattern is a pipe, the entire core will be fed to the user mode helper.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
Cc: <martin.pitt@ubuntu.com>
Cc: <wwoods@redhat.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:50 -07:00
Paul E. McKenney
3806204ca9 Remove workaround for unimmunized rcu_dereference from mce_log()
Remove the rmb() from mce_log(), since the immunized version of
rcu_dereference() makes it unnecessary.

Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:46 -07:00
Christoph Lameter
4ba9b9d0ba Slab API: remove useless ctor parameter and reorder parameters
Slab constructors currently have a flags parameter that is never used.  And
the order of the arguments is opposite to other slab functions.  The object
pointer is placed before the kmem_cache pointer.

Convert

        ctor(void *object, struct kmem_cache *s, unsigned long flags)

to

        ctor(struct kmem_cache *s, void *object)

throughout the kernel

[akpm@linux-foundation.org: coupla fixes]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:45 -07:00
Mark Nelson
5b20cd80b4 x86: replace NT_PRXFPREG with ELF_CORE_XFPREG_TYPE #define
Replace NT_PRXFPREG with ELF_CORE_XFPREG_TYPE in the coredump code which
allows for more flexibility in the note type for the state of 'extended
floating point' implementations in coredumps.  New note types can now be
added with an appropriate #define.

This does #define ELF_CORE_XFPREG_TYPE to be NT_PRXFPREG in all
current users so there's are no change in behaviour.

This will let us use different note types on powerpc for the Altivec/VMX
state that some PowerPC cpus have (G4, PPC970, POWER6) and for the SPE
(signal processing extension) state that some embedded PowerPC cpus from
Freescale have.

Signed-off-by: Mark Nelson <markn@au1.ibm.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17 08:42:44 -07:00
H. Peter Anvin
30c826451d [x86] remove uses of magic macros for boot_params access
Instead of using magic macros for boot_params access, simply use the
boot_params structure.

Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2007-10-16 17:38:31 -07:00
Linus Torvalds
f563d53c30 Merge ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* ssh://master.kernel.org/pub/scm/linux/kernel/git/sam/kbuild:
  x86: fix boot error introduced by kbuild
2007-10-16 16:51:11 -07:00
Sam Ravnborg
7eebb93486 x86: fix boot error introduced by kbuild
x86 uses target specific assignment of EXTRA_AFLAGS,
EXTRA_CFLAGS - this caused troubles with
introducing asflags-y, ccflags-y.

Fixed the target specific assignments in arch/x86/boot/Makefile
and auditted the rest of the kernel for similar usage.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-10-16 23:50:33 +02:00
Jeremy Fitzhardinge
a122d6230e xen: add some debug output for failed multicalls
Multicalls are expected to never fail, and the normal response to a
failed multicall is very terse.  In the interests of better
debuggability, add some more verbose output.  It may be worth turning
this off once it all seems more tested.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge
e3d2697669 xen: fix incorrect vcpu_register_vcpu_info hypercall argument
The kernel's copy of struct vcpu_register_vcpu_info was out of date,
at best causing the hypercall to fail and the guest kernel to fall
back to the old mechanism, or worse, causing random memory corruption.

[ Stable folks: applies to 2.6.23 ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
Cc: Morten =?utf-8?q?B=C3=B8geskov?= <xen-users@morten.bogeskov.dk>
Cc: Mark Williamson <mark.williamson@cl.cam.ac.uk>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge
fb1d84043c xen: ask the hypervisor how much space it needs reserved
Ask the hypervisor how much space it needs reserved, since 32-on-64
doesn't need any space, and it may change in future.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:31 -07:00
Jeremy Fitzhardinge
74260714c5 xen: lock pte pages while pinning/unpinning
When a pagetable is created, it is made globally visible in the rmap
prio tree before it is pinned via arch_dup_mmap(), and remains in the
rmap tree while it is unpinned with arch_exit_mmap().

This means that other CPUs may race with the pinning/unpinning
process, and see a pte between when it gets marked RO and actually
pinned, causing any pte updates to fail with write-protect faults.

As a result, all pte pages must be properly locked, and only unlocked
once the pinning/unpinning process has finished.

In order to avoid taking spinlocks for the whole pagetable - which may
overflow the PREEMPT_BITS portion of preempt counter - it locks and pins
each pte page individually, and then finally pins the whole pagetable.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Hugh Dickens <hugh@veritas.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andi Kleen <ak@suse.de>
Cc: Keir Fraser <keir@xensource.com>
Cc: Jan Beulich <jbeulich@novell.com>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge
9f79991d41 xen: deal with stale cr3 values when unpinning pagetables
When a pagetable is no longer in use, it must be unpinned so that its
pages can be freed.  However, this is only possible if there are no
stray uses of the pagetable.  The code currently deals with all the
usual cases, but there's a rare case where a vcpu is changing cr3, but
is doing so lazily, and the change hasn't actually happened by the time
the pagetable is unpinned, even though it appears to have been completed.

This change adds a second per-cpu cr3 variable - xen_current_cr3 -
which tracks the actual state of the vcpu cr3.  It is only updated once
the actual hypercall to set cr3 has been completed.  Other processors
wishing to unpin a pagetable can check other vcpu's xen_current_cr3
values to see if any cross-cpu IPIs are needed to clean things up.

[ Stable folks: 2.6.23 bugfix ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge
91e0c5f3da xen: add batch completion callbacks
This adds a mechanism to register a callback function to be called once
a batch of hypercalls has been issued.  This is typically used to unlock
things which must remain locked until the hypercall has taken place.

[ Stable folks: pre-req for 2.6.23 bugfix "xen: deal with stale cr3
  values when unpinning pagetables" ]

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Stable Kernel <stable@kernel.org>
2007-10-16 11:51:30 -07:00
Jeremy Fitzhardinge
f0d7339427 xen: yield to IPI target if necessary
When sending a call-function IPI to a vcpu, yield if the vcpu isn't
running.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:30 -07:00
Jesper Juhl
d626a1f1cb Clean up duplicate includes in arch/i386/xen/
This patch cleans up duplicate includes in
	arch/i386/xen/

Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge
4f81784774 remove dead code in pgtable_cache_init
The conversion from using a slab cache to quicklist left some residual
dead code.

I note that in the conversion it now always allocates a whole page for
the pgd, rather than the 32 bytes needed for a PAE pgd.  Was this
intended?

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge
8965c1c095 paravirt: clean up lazy mode handling
Currently, the set_lazy_mode pv_op is overloaded with 5 functions:
 1. enter lazy cpu mode
 2. leave lazy cpu mode
 3. enter lazy mmu mode
 4. leave lazy mmu mode
 5. flush pending batched operations

This complicates each paravirt backend, since it needs to deal with
all the possible state transitions, handling flushing, etc. In
particular, flushing is quite distinct from the other 4 functions, and
seems to just cause complication.

This patch removes the set_lazy_mode operation, and adds "enter" and
"leave" lazy mode operations on mmu_ops and cpu_ops.  All the logic
associated with enter and leaving lazy states is now in common code
(basically BUG_ONs to make sure that no mode is current when entering
a lazy mode, and make sure that the mode is current when leaving).
Also, flush is handled in a common way, by simply leaving and
re-entering the lazy mode.

The result is that the Xen, lguest and VMI lazy mode implementations
are much simpler.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
2007-10-16 11:51:29 -07:00
Jeremy Fitzhardinge
93b1eab3d2 paravirt: refactor struct paravirt_ops into smaller pv_*_ops
This patch refactors the paravirt_ops structure into groups of
functionally related ops:

pv_info - random info, rather than function entrypoints
pv_init_ops - functions used at boot time (some for module_init too)
pv_misc_ops - lazy mode, which didn't fit well anywhere else
pv_time_ops - time-related functions
pv_cpu_ops - various privileged instruction ops
pv_irq_ops - operations for managing interrupt state
pv_apic_ops - APIC operations
pv_mmu_ops - operations for managing pagetables

There are several motivations for this:

1. Some of these ops will be general to all x86, and some will be
   i386/x86-64 specific.  This makes it easier to share common stuff
   while allowing separate implementations where needed.

2. At the moment we must export all of paravirt_ops, but modules only
   need selected parts of it.  This allows us to export on a case by case
   basis (and also choose which export license we want to apply).

3. Functional groupings make things a bit more readable.

Struct paravirt_ops is now only used as a template to generate
patch-site identifiers, and to extract function pointers for inserting
into jmp/calls when patching.  It is only instantiated when needed.

Signed-off-by: Jeremy Fitzhardinge <jeremy@xensource.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Cc: Andi Kleen <ak@suse.de>
Cc: Zach Amsden <zach@vmware.com>
Cc: Avi Kivity <avi@qumranet.com>
Cc: Anthony Liguory <aliguori@us.ibm.com>
Cc: "Glauber de Oliveira Costa" <glommer@gmail.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
2007-10-16 11:51:29 -07:00
Linus Torvalds
821f3eff7c Merge git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild
* git://git.kernel.org/pub/scm/linux/kernel/git/sam/kbuild: (40 commits)
  kbuild: introduce ccflags-y, asflags-y and ldflags-y
  kbuild: enable 'make CPPFLAGS=...' to add additional options to CPP
  kbuild: enable use of AFLAGS and CFLAGS on commandline
  kbuild: enable 'make AFLAGS=...' to add additional options to AS
  kbuild: fix AFLAGS use in h8300 and m68knommu
  kbuild: check for wrong use of CFLAGS
  kbuild: enable 'make CFLAGS=...' to add additional options to CC
  kbuild: fix up CFLAGS usage
  kbuild: make modpost detect unterminated device id lists
  kbuild: call export_report from the Makefile
  kbuild: move Kai Germaschewski to CREDITS
  kconfig/menuconfig: distinguish between selected-by-another options and comments
  kconfig: tristate choices with mixed tristate and boolean values
  include/linux/Kbuild: remove duplicate entries
  kbuild: kill backward compatibility checks
  kbuild: kill EXTRA_ARFLAGS
  kbuild: fix documentation in makefiles.txt
  kbuild: call make once for all targets when O=.. is used
  kbuild: pass -g to assembler under CONFIG_DEBUG_INFO
  kbuild: update _shipped files for kconfig syntax cleanup
  ...

Fix up conflicts in arch/um/sys-{x86_64,i386}/Makefile manually.
2007-10-16 11:23:06 -07:00
Linus Torvalds
92d15c2ccb Merge branch 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block
* 'for-linus' of git://git.kernel.dk/data/git/linux-2.6-block: (63 commits)
  Fix memory leak in dm-crypt
  SPARC64: sg chaining support
  SPARC: sg chaining support
  PPC: sg chaining support
  PS3: sg chaining support
  IA64: sg chaining support
  x86-64: enable sg chaining
  x86-64: update pci-gart iommu to sg helpers
  x86-64: update nommu to sg helpers
  x86-64: update calgary iommu to sg helpers
  swiotlb: sg chaining support
  i386: enable sg chaining
  i386 dma_map_sg: convert to using sg helpers
  mmc: need to zero sglist on init
  Panic in blk_rq_map_sg() from CCISS driver
  remove sglist_len
  remove blk_queue_max_phys_segments in libata
  revert sg segment size ifdefs
  Fixup u14-34f ENABLE_SG_CHAINING
  qla1280: enable use_sg_chaining option
  ...
2007-10-16 10:09:16 -07:00
Masami Hiramatsu
f438d914b2 kprobes: support kretprobe blacklist
Introduce architecture dependent kretprobe blacklists to prohibit users
from inserting return probes on the function in which kprobes can be
inserted but kretprobes can not.

This patch also removes "__kprobes" mark from "__switch_to" on x86_64 and
registers "__switch_to" to the blacklist on x86-64, because that mark is to
prohibit user from inserting only kretprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@redhat.com>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Acked-by: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:43:10 -07:00
KAMEZAWA Hiroyuki
48e94196a5 fix memory hot remove not configured case.
Now, arch dependent code around CONFIG_MEMORY_HOTREMOVE is a mess.
This patch cleans up them. This is against 2.6.23-rc6-mm1.

 - fix compile failure on ia64/ CONFIG_MEMORY_HOTPLUG && !CONFIG_MEMORY_HOTREMOVE case.
 - For !CONFIG_MEMORY_HOTREMOVE, add generic no-op remove_memory(),
   which returns -EINVAL.
 - removed remove_pages() only used in powerpc.
 - removed no-op remove_memory() in i386, sh, sparc64, x86_64.

 - only powerpc returns -ENOSYS at memory hot remove(no-op). changes it
   to return -EINVAL.

Note:
Currently, only ia64 supports CONFIG_MEMORY_HOTREMOVE. I welcome other
archs if there are requirements and testers.

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:43:02 -07:00
Will Schmidt
dcca2bde4f During VM oom condition, kill all threads in process group
We have had complaints where a threaded application is left in a bad state
after one of it's threads is killed when we hit a VM: out_of_memory
condition.

Killing just one of the process threads can leave the application in a bad
state, whereas killing the entire process group would allow for the
application to restart, or be otherwise handled, and makes it very obvious
that something has gone wrong.

This change allows the entire process group to be taken down, rather
than just the one thread.

Signed-off-by: Will Schmidt <will_schmidt@vnet.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ian Molton <spyro@f2s.com>
Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Roman Zippel <zippel@linux-m68k.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Matthew Wilcox <willy@debian.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Cc: Richard Curnow <rc@rc0.org.uk>
Cc: William Lee Irwin III <wli@holomorphy.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Chris Zankel <chris@zankel.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:52 -07:00
Christoph Lameter
0889eba5b3 x86_64: SPARSEMEM_VMEMMAP 2M page size support
x86_64 uses 2M page table entries to map its 1-1 kernel space.  We also
implement the virtual memmap using 2M page table entries.  So there is no
additional runtime overhead over FLATMEM, initialisation is slightly more
complex.  As FLATMEM still references memory to obtain the mem_map pointer and
SPARSEMEM_VMEMMAP uses a compile time constant, SPARSEMEM_VMEMMAP should be
superior.

With this SPARSEMEM becomes the most efficient way of handling virt_to_page,
pfn_to_page and friends for UP, SMP and NUMA on x86_64.

[apw@shadowen.org: code resplit, style fixups]
[apw@shadowen.org: vmemmap x86_64: ensure end of section memmap is initialised]
Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Acked-by: Mel Gorman <mel@csn.ul.ie>
Cc: Andi Kleen <ak@suse.de>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:51 -07:00
Christoph Hellwig
74a0b57627 x86: optimize page faults like all other achitectures and kill notifier cruft
x86(-64) are the last architectures still using the page fault notifier
cruft for the kprobes page fault hook.  This patch converts them to the
proper direct calls, and removes the now unused pagefault notifier bits
aswell as the cruft in kprobes.c that was related to this mess.

I know Andi didn't really like this, but all other architecture maintainers
agreed the direct calls are much better and besides the obvious cruft
removal a common way of dealing with kprobes across architectures is
important aswell.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: fix sparc64]
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Andi Kleen <ak@suse.de>
Cc: <linux-arch@vger.kernel.org>
Cc: Prasanna S Panchamukhi <prasanna@in.ibm.com>
Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com>
Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Mike Travis
d5a7430ddc Convert cpu_sibling_map to be a per cpu variable
Convert cpu_sibling_map from a static array sized by NR_CPUS to a per_cpu
variable.  This saves sizeof(cpumask_t) * NR unused cpus.  Access is mostly
from startup and CPU HOTPLUG functions.

Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Mike Travis
0835761129 x86: Convert cpu_core_map to be a per cpu variable
This is from an earlier message from 'Christoph Lameter':

    cpu_core_map is currently an array defined using NR_CPUS. This means that
    we overallocate since we will rarely really use maximum configured cpu.

    If we put the cpu_core_map into the per cpu area then it will be allocated
    for each processor as it comes online.

    This means that the core map cannot be accessed until the per cpu area
    has been allocated. Xen does a weird thing here looping over all processors
    and zeroing the masks that are not yet allocated and that will be zeroed
    when they are allocated. I commented the code out.

Signed-off-by: Christoph Lameter <clameter@sgi.com>
Signed-off-by: Mike Travis <travis@sgi.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Christoph Lameter <clameter@sgi.com>
Cc: "Siddha, Suresh B" <suresh.b.siddha@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:50 -07:00
Alexey Dobriyan
1bcf548293 Consolidate PTRACE_DETACH
Identical handlers of PTRACE_DETACH go into ptrace_request().
Not touching compat code.
Not touching archs that don't call ptrace_request.

Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
Acked-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-16 09:42:49 -07:00
Jens Axboe
9ee1bea469 x86-64: update pci-gart iommu to sg helpers
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Jens Axboe
b922f53bd0 x86-64: update nommu to sg helpers
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Jens Axboe
8b87d9f48a x86-64: update calgary iommu to sg helpers
Acked-by: Muli Ben-Yehuda <muli@il.ibm.com>
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-10-16 11:26:02 +02:00
Sam Ravnborg
222d394d30 kbuild: enable 'make AFLAGS=...' to add additional options to AS
The variable AFLAGS is a wellknown variable and the usage by
kbuild may result in unexpected behaviour.
On top of that several people over time has asked for a way to
pass in additional flags to gcc.

This patch replace use of AFLAGS with KBUILD_AFLAGS all over
the tree.

Patch was tested on following architectures:
alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k, s390

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-10-15 21:59:31 +02:00
Sam Ravnborg
a0f97e06a4 kbuild: enable 'make CFLAGS=...' to add additional options to CC
The variable CFLAGS is a wellknown variable and the usage by
kbuild may result in unexpected behaviour.
On top of that several people over time has asked for a way to
pass in additional flags to gcc.

This patch replace use of CFLAGS with KBUILD_CFLAGS all over the
tree and enabling one to use:
make CFLAGS=...
to specify additional gcc commandline options.

One usecase is when trying to find gcc bugs but other
use cases has been requested too.

Patch was tested on following architectures:
alpha, arm, i386, x86_64, mips, sparc, sparc64, ia64, m68k

Test was simple to do a defconfig build, apply the patch and check
that nothing got rebuild.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-10-14 22:21:35 +02:00
Sam Ravnborg
9a39e273d4 kbuild: fix up CFLAGS usage
Only in very rare cases is it needed to change CFLAGS
outside of arch/*/Makefile.
Fix up all wrong cases - in most cases
the use of EXTRA_CFLAGS is the only thing needed.

Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2007-10-14 21:49:42 +02:00
Linus Torvalds
6a84258e5f Merge master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6
* master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (37 commits)
  PCI: merge almost all of pci_32.h and pci_64.h together
  PCI: X86: Introduce and enable PCI domain support
  PCI: Add 'nodomains' boot option, and pci_domains_supported global
  PCI: modify PCI bridge control ISA flag for clarity
  PCI: use _CRS for PCI resource allocation
  PCI: avoid P2P prefetch window for expansion ROMs
  PCI: skip ISA ioresource alignment on some systems
  PCI: remove transparent bridge sizing
  pci: write file size to inode on proc bus file write
  pci: use size stored in proc_dir_entry for proc bus files
  pci: implement "pci=noaer"
  PCI: fix IDE legacy mode resources
  MSI: Use correct data offset for 32-bit MSI in read_msi_msg()
  PCI: Fix incorrect argument order to list_add_tail() in PCI dynamic ID code
  PCI: i386: Compaq EVO N800c needs PCI bus renumbering
  PCI: Remove no longer correct documentation regarding MSI vector assignment
  PCI: re-enable onboard sound on "MSI K8T Neo2-FIR"
  PCI: quirk_vt82c586_acpi: Omit reading PCI revision ID
  PCI: quirk amd_8131_mmrbc: Omit reading pci revision ID
  cpqphp: Use PCI_CLASS_REVISION instead of PCI_REVISION_ID for read
  ...
2007-10-12 15:50:23 -07:00
Linus Torvalds
4d5709a7b7 Merge master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq
* master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] Don't take semaphore in cpufreq_quick_get()
  [CPUFREQ] Support different families in fid/did to frequency conversion
  [CPUFREQ] cpufreq_stats: misc cpuinit section annotations
  [CPUFREQ] implement !CONFIG_CPU_FREQ stub for  cpufreq_unregister_notifier()
  [CPUFREQ] mark hotplug notifier callback as __cpuinit
  [CPUFREQ] Only check for transition latency on problematic governors (kconfig fix)
  [CPUFREQ] allow ondemand and conservative cpufreq governors to be used as default
  [CPUFREQ] move policy's governor initialisation out of low-level drivers into cpufreq core
  [CPUFREQ] Longhaul - Add support for PM133 northbridge
  [CPUFREQ] x86: use num_online_nodes to get physical cpus numbers for
2007-10-12 15:42:01 -07:00
Jeff Garzik
a79e4198d1 PCI: X86: Introduce and enable PCI domain support
* fix bug in pci_read() and pci_write() which prevented PCI domain
  support from working (hardcoded domain 0).

* unconditionally enable CONFIG_PCI_DOMAINS

* implement pci_domain_nr() and pci_proc_domain(), as required of
  all arches when CONFIG_PCI_DOMAINS is enabled.

* store domain in struct pci_sysdata, as assigned by ACPI

* support "pci=nodomains"

Signed-off-by: Jeff Garzik <jgarzik@redhat.com>
Cc: Andi Kleen <ak@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:19 -07:00
Gary Hade
11949255d9 PCI: modify PCI bridge control ISA flag for clarity
Modify PCI Bridge Control ISA flag for clarity

This patch changes PCI_BRIDGE_CTL_NO_ISA to PCI_BRIDGE_CTL_ISA
and modifies it's clarifying comment and locations where used.
The change reduces the chance of future confusion since it makes
the set/unset meaning of the bit the same in both the bridge
control register and bridge_ctl field of the pci_bus struct.

Signed-off-by: Gary Hade <garyhade@us.ibm.com>
Acked-by: Linas Vepstas <linas@austin.ibm.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2007-10-12 15:03:18 -07:00