12819 Commits

Author SHA1 Message Date
Steven Rostedt
722b3c7469 ftrace/graph: Trace function entry before updating index
Currently the index to the ret_stack is updated and the real return address
is saved in the ret_stack. Then we call the trace function. The trace
function could decide that it doesn't want to trace this function
(ex. set_graph_function does not match) and it will return 0 which means
not to trace this call.

The normal function graph tracer has this code:

	if (!(trace->depth || ftrace_graph_addr(trace->func)) ||
	      ftrace_graph_ignore_irqs())
		return 0;

What this states is, if the trace depth (which is curr_ret_stack)
is zero (top of nested functions) then test if we want to trace this
function. If this function is not to be traced, then return  0 and
the rest of the function graph tracer logic will not trace this function.

The problem arises when an interrupt comes in after we updated the
curr_ret_stack. The next function that gets called will have a trace->depth
of 1. Which fools this trace code into thinking that we are in a nested
function, and that we should trace. This causes interrupts to be traced
when they should not be.

The solution is to trace the function first and then update the ret_stack.

Reported-by: zhiping zhong <xzhong86@163.com>
Reported-by: wu zhangjin <wuzhangjin@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10 10:34:43 -05:00
David Sharp
d5bf2ff072 tracing: Fix event alignment: kvm:kvm_hv_hypercall
Acked-by: Avi Kivity <avi@redhat.com>
Signed-off-by: David Sharp <dhsharp@google.com>
LKML-Reference: <1291421609-14665-8-git-send-email-dhsharp@google.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2011-03-10 10:34:24 -05:00
Andrea Arcangeli
a79e53d856 x86/mm: Fix pgd_lock deadlock
It's forbidden to take the page_table_lock with the irq disabled
or if there's contention the IPIs (for tlb flushes) sent with
the page_table_lock held will never run leading to a deadlock.

Nobody takes the pgd_lock from irq context so the _irqsave can be
removed.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Acked-by: Rik van Riel <riel@redhat.com>
Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <201102162345.p1GNjMjm021738@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-10 09:41:57 +01:00
Andrey Vagin
f86268549f x86/mm: Handle mm_fault_error() in kernel space
mm_fault_error() should not execute oom-killer, if page fault
occurs in kernel space.  E.g. in copy_from_user()/copy_to_user().

This would happen if we find ourselves in OOM on a
copy_to_user(), or a copy_from_user() which faults.

Without this patch, the kernels hangs up in copy_from_user(),
because OOM killer sends SIG_KILL to current process, but it
can't handle a signal while in syscall, then the kernel returns
to copy_from_user(), reexcute current command and provokes
page_fault again.

With this patch the kernel return -EFAULT from copy_from_user().

The code, which checks that page fault occurred in kernel space,
has been copied from do_sigbus().

This situation is handled by the same way on powerpc, xtensa,
tile, ...

Signed-off-by: Andrey Vagin <avagin@openvz.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: <stable@kernel.org>
LKML-Reference: <201103092322.p29NMNPH001682@imap1.linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-10 09:41:40 +01:00
Naga Chumbalkar
1f858ef2fb [CPUFREQ] pcc-cpufreq: don't load driver if get_freq fails during init.
Return 0 on failure. This will cause the initialization of the driver
to fail and prevent the driver from loading if the BIOS cannot handle
the PCC interface command to "get frequency". Otherwise, the driver
will load and display a very high value like "4294967274" (which is
actually -EINVAL) for frequency:

# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
4294967274

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
CC: stable@kernel.org
Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-09 12:33:15 -05:00
Naga Chumbalkar
a7bd1dafdc x86: Don't check for BIOS corruption in first 64K when there's no need to
Due to commit 781c5a67f152c17c3e4a9ed9647f8c0be6ea5ae9 it is
likely that the number of areas to scan for BIOS corruption is 0
 -- especially when the first 64K is already reserved
(X86_RESERVE_LOW is 64K by default).

If that's the case then don't set up the scan.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
Cc: <stable@kernel.org>
LKML-Reference: <20110225202838.2229.71011.sendpatchset@nchumbalkar.americas.hpqcorp.net>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 16:36:41 +01:00
Cliff Wickman
5471262290 x86, UV: Initialize the broadcast assist unit base destination node id properly
The BAU's initialization of the broadcast description header is
lacking the coherence domain (high bits) in the nasid.  This
causes a catastrophic system failure when running on a system
with multiple coherence domains.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
LKML-Reference: <E1PxKBB-0005F0-3U@eag09.americas.sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 16:36:16 +01:00
Jan Beulich
c49aa5bd13 x86: Remove dead config option X86_CPU
This isn't being referenced anywhere, and the selects done from
it can be easily done together with all the other X86 ones.

 v2: Also adjust UML's Kconfig.x86.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D7603DA02000078000351C1@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 10:39:36 +01:00
Ingo Molnar
c8b44163b7 Merge commit 'v2.6.38-rc8' into x86/asm
Merge reason: Update with the latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 10:38:59 +01:00
Sedat Dilek
2ae9d293b1 x86: Fix binutils-2.21 symbol related build failures
New binutils version 2.21.0.20110302-1 started checking that the symbol
parameter to the .size directive matches the entry name's
symbol parameter, unearthing two mismatches:

  AS      arch/x86/kernel/acpi/wakeup_rm.o
  arch/x86/kernel/acpi/wakeup_rm.S: Assembler messages:
  arch/x86/kernel/acpi/wakeup_rm.S:12: Error: .size expression with symbol `wakeup_code_start' does not evaluate to a constant

  arch/x86/kernel/entry_32.S: Assembler messages:
  arch/x86/kernel/entry_32.S:1421: Error: .size expression with
  symbol `apf_page_fault' does not evaluate to a constant

The problem was discovered while using Debian's binutils
(2.21.0.20110302-1) and experimenting with binutils from
upstream.

Thanks Alexander and H.J. for the vital help.

Signed-off-by: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
Cc: H.J. Lu <hjl.tools@gmail.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Rafael J. Wysocki <rjw@sisk.pl>
LKML-Reference: <1299620364-21644-1-git-send-email-sedat.dilek@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-09 10:25:45 +01:00
Jiri Olsa
2a8247a260 kprobes: Disabling optimized kprobes for entry text section
You can crash the kernel (with root/admin privileges) using kprobe tracer by running:

 echo "p system_call_after_swapgs" > ./kprobe_events
 echo 1 > ./events/kprobes/enable

The reason is that at the system_call_after_swapgs label, the
kernel stack is not set up. If optimized kprobes are enabled,
the user space stack is being used in this case (see optimized
kprobe template) and this might result in a crash.

There are several places like this over the entry code
(entry_$BIT). As it seems there's no any reasonable/maintainable
way to disable only those places where the stack is not ready, I
switched off the whole entry code from kprobe optimizing.

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: acme@redhat.com
Cc: fweisbec@gmail.com
Cc: ananth@in.ibm.com
Cc: davem@davemloft.net
Cc: a.p.zijlstra@chello.nl
Cc: eric.dumazet@gmail.com
Cc: 2nddept-manager@sdl.hitachi.co.jp
LKML-Reference: <1298298313-5980-3-git-send-email-jolsa@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08 17:22:12 +01:00
Jiri Olsa
ea7145477a x86: Separate out entry text section
Put x86 entry code into a separate link section: .entry.text.

Separating the entry text section seems to have performance
benefits - caused by more efficient instruction cache usage.

Running hackbench with perf stat --repeat showed that the change
compresses the icache footprint. The icache load miss rate went
down by about 15%:

 before patch:
         19417627  L1-icache-load-misses      ( +-   0.147% )

 after patch:
         16490788  L1-icache-load-misses      ( +-   0.180% )

The motivation of the patch was to fix a particular kprobes
bug that relates to the entry text section, the performance
advantage was discovered accidentally.

Whole perf output follows:

 - results for current tip tree:

  Performance counter stats for './hackbench/hackbench 10' (500 runs):

         19417627  L1-icache-load-misses      ( +-   0.147% )
       2676914223  instructions             #      0.497 IPC     ( +- 0.079% )
       5389516026  cycles                     ( +-   0.144% )

      0.206267711  seconds time elapsed   ( +-   0.138% )

 - results for current tip tree with the patch applied:

  Performance counter stats for './hackbench/hackbench 10' (500 runs):

         16490788  L1-icache-load-misses      ( +-   0.180% )
       2717734941  instructions             #      0.502 IPC     ( +- 0.079% )
       5414756975  cycles                     ( +-   0.148% )

      0.206747566  seconds time elapsed   ( +-   0.137% )

Signed-off-by: Jiri Olsa <jolsa@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Piggin <npiggin@kernel.dk>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: masami.hiramatsu.pt@hitachi.com
Cc: ananth@in.ibm.com
Cc: davem@davemloft.net
Cc: 2nddept-manager@sdl.hitachi.co.jp
LKML-Reference: <20110307181039.GB15197@jolsa.redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08 17:22:11 +01:00
Ingo Molnar
86cb2ec7b2 Merge commit 'v2.6.38-rc8' into perf/core
Merge reason: Merge latest fixes.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-08 17:21:52 +01:00
David Howells
ee009e4a0d KEYS: Add an iovec version of KEYCTL_INSTANTIATE
Add a keyctl op (KEYCTL_INSTANTIATE_IOV) that is like KEYCTL_INSTANTIATE, but
takes an iovec array and concatenates the data in-kernel into one buffer.
Since the KEYCTL_INSTANTIATE copies the data anyway, this isn't too much of a
problem.

Signed-off-by: David Howells <dhowells@redhat.com>
Signed-off-by: James Morris <jmorris@namei.org>
2011-03-08 11:17:22 +11:00
Jan Beulich
ac23f25355 x86: Really print supported CPUs if PROCESSOR_SELECT=y
I'm sure it was a mere oversight that the CONFIG_ prefixes are
missing.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Dave Jones <davej@redhat.com>
LKML-Reference: <4D7118D30200007800034F79@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-05 09:29:45 +01:00
Ingo Molnar
ca764aaf02 Merge branch 'x86-mm' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/misc into x86/mm 2011-03-05 07:32:45 +01:00
Lin Ming
6909262429 perf: Avoid the percore allocations if the CPU is not HT capable
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-5-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-05 07:12:16 +01:00
Tejun Heo
078a198906 x86-64, NUMA: Don't assume phys node 0 is always online in numa_emulation()
Undetermined entries in emu_nid_to_phys[] are filled with zero
assuming that physical node 0 is always online; however, this might
not be true depending on hardware configuration.  Find a physical node
which is actually online and use it instead.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: David Rientjes <rientjes@google.com>
LKML-Reference: <alpine.DEB.2.00.1103020628210.31626@chino.kir.corp.google.com>
2011-03-04 16:32:37 +01:00
Yinghai Lu
3b28cf32cc x86, numa: Fix numa_emulation code with memory-less node0
This crash happens on a system that does not have RAM on node0.

When numa_emulation is compiled in, and:

 1. we boot the system without numa=fake...
 2. or we boot the system with numa=fake=128 to make emulation fail

we will get:

[    0.076025] ------------[ cut here ]------------
[    0.080004] kernel BUG at arch/x86/mm/numa_64.c:788!
[    0.080004] invalid opcode: 0000 [#1] SMP
[...]

need to use early_cpu_to_node() directly, because cpu_to_apicid
and apicid_to_node will return node0 that is not onlined.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
LKML-Reference: <4D6ECF72.5010308@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 15:20:19 +01:00
David Rientjes
c09cedf4f7 x86-64, NUMA: Clean up initmem_init()
This patch cleans initmem_init() so that it is more readable and doesn't
use an unnecessary array of function pointers to convolute the flow of
the code.  It also makes it obvious that dummy_numa_init() will always
succeed (and documents that requirement) so that the existing BUG() is
never actually reached.

No functional change.

-tj: Updated comment for dummy_numa_init() slightly.

Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-03-04 15:17:21 +01:00
Yinghai Lu
51b361b400 x86-64, NUMA: Fix numa_emulation code with node0 without RAM
On one system that does not have RAM on node0.

When numa_emulation is compiled in, and
1. boot system without numa=fake...
2. or boot system with numa=fake=128 to make emulation fail

will get:

[    0.092026] ------------[ cut here ]------------
[    0.096005] kernel BUG at arch/x86/mm/numa_emulation.c:439!
[    0.096005] invalid opcode: 0000 [#1] SMP
[    0.096005] last sysfs file:
[    0.096005] CPU 0
[    0.096005] Modules linked in:
[    0.096005]
[    0.096005] Pid: 0, comm: swapper Not tainted 2.6.38-rc6-tip-yh-03869-gcb0491d-dirty #684 Sun Microsystems     Sun Fire X4240/Sun Fire X4240
[    0.096005] RIP: 0010:[<ffffffff81cdc65b>]  [<ffffffff81cdc65b>] numa_add_cpu+0x56/0xcf
[    0.096005] RSP: 0000:ffffffff82437ed8  EFLAGS: 00010246
...
[    0.096005] Call Trace:
[    0.096005]  [<ffffffff81cd7931>] identify_cpu+0x2d7/0x2df
[    0.096005]  [<ffffffff827e54fa>] identify_boot_cpu+0x10/0x30
[    0.096005]  [<ffffffff827e5704>] check_bugs+0x9/0x2d
[    0.096005]  [<ffffffff827dceda>] start_kernel+0x3d7/0x3f1
[    0.096005]  [<ffffffff827dc2cc>] x86_64_start_reservations+0x9c/0xa0
[    0.096005]  [<ffffffff827dc4ad>] x86_64_start_kernel+0x1dd/0x1e8
[    0.096005] Code: 74 06 48 8d 04 90 eb 0f 48 c7 c0 30 d9 00 00 48 03 04 d5 90 0f 60 82 8b 00 83 f8 ff 74 0d 0f a3 05 8b 7e 92 00 19 d2 85 d2 75 02 <0f> 0b 48 98 be 00 01 00 00 48 c7 c7 e0 44 60 82 44 8b 2c 85 e0
[    0.096005] RIP  [<ffffffff81cdc65b>] numa_add_cpu+0x56/0xcf
[    0.096005]  RSP <ffffffff82437ed8>
[    0.096026] ---[ end trace a7919e7f17c0a725 ]---

We need to use early_cpu_to_node() directly, because numa_cpu_node()
will return node0 that is not onlined.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-03-04 14:49:28 +01:00
Andi Kleen
e994d7d23a perf: Fix LLC-* events on Intel Nehalem/Westmere
On Intel Nehalem and Westmere CPUs the generic perf LLC-* events count the
L2 caches, not the real L3 LLC - this was inconsistent with behavior on
other CPUs.

Fixing this requires the use of the special OFFCORE_RESPONSE
events which need a separate mask register.

This has been implemented by the previous patch, now use this infrastructure
to set correct events for the LLC-* on Nehalem and Westmere.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-3-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 11:32:53 +01:00
Andi Kleen
a7e3ed1e47 perf: Add support for supplementary event registers
Change logs against Andi's original version:

- Extends perf_event_attr:config to config{,1,2} (Peter Zijlstra)
- Fixed a major event scheduling issue. There cannot be a ref++ on an
  event that has already done ref++ once and without calling
  put_constraint() in between. (Stephane Eranian)
- Use thread_cpumask for percore allocation. (Lin Ming)
- Use MSR names in the extra reg lists. (Lin Ming)
- Remove redundant "c = NULL" in intel_percore_constraints
- Fix comment of perf_event_attr::config1

Intel Nehalem/Westmere have a special OFFCORE_RESPONSE event
that can be used to monitor any offcore accesses from a core.
This is a very useful event for various tunings, and it's
also needed to implement the generic LLC-* events correctly.

Unfortunately this event requires programming a mask in a separate
register. And worse this separate register is per core, not per
CPU thread.

This patch:

- Teaches perf_events that OFFCORE_RESPONSE needs extra parameters.
  The extra parameters are passed by user space in the
  perf_event_attr::config1 field.

- Adds support to the Intel perf_event core to schedule per
  core resources. This adds fairly generic infrastructure that
  can be also used for other per core resources.
  The basic code has is patterned after the similar AMD northbridge
  constraints code.

Thanks to Stephane Eranian who pointed out some problems
in the original version and suggested improvements.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1299119690-13991-2-git-send-email-ming.m.lin@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 11:32:53 +01:00
Stephane Eranian
17e3162972 perf_events: Update PEBS event constraints
This patch updates PEBS event constraints for Intel Atom, Nehalem, Westmere.

This patch also reorganizes the PEBS format/constraint detection code. It is
now based on processor model and not PEBS format. Two processors may use the
same PEBS format without have the same list of PEBS events.

In this second version, we simplified the initialization of the PEBS
constraints by leveraging the existing switch() statement in perf_event_intel.c.
We also renamed the constraint tables to be more consistent with regular
constraints.

In this 3rd version, we drop BR_INST_RETIRED.MISPRED from Intel Atom as it does
not seem to work. Use MISPREDICTED_BRANCH_RETIRED instead. Also add FP_ASSIST.*
o both Intel Nehalem and Westmere. I misssed those in the earlier patches.
Events were tested using libpfm4 perf_examples.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4d6e6b02.815bdf0a.637b.07a7@mx.google.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 11:32:52 +01:00
Ingo Molnar
888a8a3e9d Merge branch 'perf/urgent' into perf/core
Merge reason: Pick up updates before queueing up dependent patches.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-04 10:40:25 +01:00
Tejun Heo
f891125028 x86-64, NUMA: Revert NUMA affine page table allocation
This patch reverts NUMA affine page table allocation added by commit
1411e0ec31 (x86-64, numa: Put pgtable to local node memory).

The commit made an undocumented change where the kernel linear mapping
strictly follows intersection of e820 memory map and NUMA
configuration.  If the physical memory configuration has holes or NUMA
nodes are not properly aligned, this leads to using unnecessarily
smaller mapping size which leads to increased TLB pressure.  For
details,

  http://thread.gmane.org/gmane.linux.kernel/1104672

Patches to fix the problem have been proposed but the underlying code
needs more cleanup and the approach itself seems a bit heavy handed
and it has been determined to revert the feature for now and come back
to it in the next developement cycle.

  http://thread.gmane.org/gmane.linux.kernel/1105959

As init_memory_mapping_high() callsites have been consolidated since
the commit, reverting is done manually.  Also, the RED-PEN comment in
arch/x86/mm/init.c is not restored as the problem no longer exists
with memblock based top-down early memory allocation.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
2011-03-04 10:26:36 +01:00
Ian Campbell
f611f2da99 xen/timer: Missing IRQF_NO_SUSPEND in timer code broke suspend.
The patches missed an indirect use of IRQF_NO_SUSPEND pulled in via
IRQF_TIMER. The following patch fixes the issue.

With this fixlet PV guest migration works just fine. I also booted the
entire series as a dom0 kernel and it appeared fine.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-03 12:00:31 -05:00
Ian Campbell
3f2a230caf xen: handled remapped IRQs when enabling a pcifront PCI device.
This happens to not be an issue currently because we take pains to try
to ensure that the GSI-IRQ mapping is 1-1 in a PV guest and that
regular event channels do not clash. However a subsequent patch is
going to break this 1-1 mapping.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
2011-03-03 11:56:57 -05:00
Konrad Rzeszutek Wilk
6eaa412f27 xen: Mark all initial reserved pages for the balloon as INVALID_P2M_ENTRY.
With this patch, we diligently set regions that will be used by the
balloon driver to be INVALID_P2M_ENTRY and under the ownership
of the balloon driver. We are OK using the __set_phys_to_machine
as we do not expect to be allocating any P2M middle or entries pages.
The set_phys_to_machine has the side-effect of potentially allocating
new pages and we do not want that at this stage.

We can do this because xen_build_mfn_list_list will have already
allocated all such pages up to xen_max_p2m_pfn.

We also move the check for auto translated physmap down the
stack so it is present in __set_phys_to_machine.

[v2: Rebased with mmu->p2m code split]
Reviewed-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
2011-03-03 11:52:48 -05:00
Borislav Petkov
84fd1d35cc x86, amd-nb: Misc cleanliness fixes
Make functions used strictly in bool context return bool. Also,
fixup used types and comments, and make a local function static,
while at it.

Signed-off-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Borislav Petkov <bp@amd64.org>
LKML-Reference: <20110303115932.GA8603@aftab>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-03 13:06:20 +01:00
Jan Beulich
d04c579f97 x86: Work around old gas bug
Add extra parentheses around a couple of definitions introduced
by "x86: Cleanup vector usage" and used in assembly macro
arguments, and remove spaces. Without that old (2.16.1) gas
would see more macro arguments than were actually specified.

Reported-and-tested-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Shaohua Li <shaohua.li@intel.com>
LKML-Reference: <4D6F81B10200007800034B0B@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-03 12:47:08 +01:00
Linus Torvalds
f7d222ea2a Merge branch 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6
* 'devicetree/merge' of git://git.secretlab.ca/git/linux-2.6:
  of/promtree: allow DT device matching by fixing 'name' brokenness (v5)
  x86: OLPC: have prom_early_alloc BUG rather than return NULL
  of/flattree: Drop an uninteresting message to pr_debug level
  of: Add missing of_address.h to xilinx ehci driver
2011-03-02 20:01:57 -08:00
Linus Torvalds
ebff7c92ab Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
  [CPUFREQ] p4-clockmod: print EST-capable warning message only once
  [CPUFREQ] fix BUG on cpufreq policy init failure
  [CPUFREQ] Fix another notifier leak in powernow-k8.
  [CPUFREQ] Missing "unregister_cpu_notifier" in powernow-k8.c
2011-03-02 19:58:31 -08:00
Linus Torvalds
c7b01d3dc2 Merge branch 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6
* 'idle-release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-idle-2.6:
  intel_idle: disable Atom/Lincroft HW C-state auto-demotion
  intel_idle: disable NHM/WSM HW C-state auto-demotion
2011-03-02 18:08:03 -08:00
Andres Salomon
60cba5a57b x86: OLPC: have prom_early_alloc BUG rather than return NULL
..similar to what sparc's prom_early_alloc does.

Signed-off-by: Andres Salomon <dilinger@queued.net>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
2011-03-02 13:45:18 -07:00
Tejun Heo
eb8c1e2c83 x86-64, NUMA: Better explain numa_distance handling
Handling of out-of-bounds distances and allocation failure can use
better documentation.  Add it.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Yinghai Lu <yinghai@kernel.org>
Acked-by: David Rientjes <rientjes@google.com>
2011-03-02 16:34:21 +01:00
Yinghai Lu
ce0033307f x86-64, NUMA: Fix distance table handling
NUMA distance table handling has the following problems.

* numa_reset_distance() uses numa_distance * sizeof(numa_distance[0])
  as the table size when it should be using the square of
  numa_distance.

* The same size miscalculation when allocation space for phys_dist in
  numa_emulation().

* In numa_emulation(), phys_dist must be reserved; otherwise, the new
  emulated distance table may overlap it.

Fix them and, while at it, take numa_distance_cnt resetting in
numa_reset_distance() out of the if block to simplify the code a bit.

David Rientjes reported incorrect handling of distance table during
emulation.

-tj: Edited out numa_alloc_distance() related changes which weren't
     necessary and rewrote patch description.

-v2: Ingo was unhappy with 80-column limit induced linebreaks.  Let
     lines run over 80-column.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Reported-by: David Rientjes <rientjes@google.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Ingo Molnar <mingo@elte.hu>
Acked-by: David Rientjes <rientjes@google.com>
2011-03-02 16:34:09 +01:00
Lin Ming
b06b3d4969 perf, x86: Add Intel SandyBridge CPU support
This patch adds basic SandyBridge support, including hardware
cache events and PEBS events support.

It has been tested on SandyBridge CPUs with perf stat and also
with PEBS based profiling - both work fine.

The patch does not affect other models.

v2 -> v3:
 - fix PEBS event 0xd0 with right umask combinations
 - move snb pebs constraint assignment to intel_pmu_init

v1 -> v2:
 - add more raw and PEBS events constraints
 - use offcore events for LLC-* cache events
 - remove the call to Nehalem workaround enable_all function

Signed-off-by: Lin Ming <ming.m.lin@intel.com>
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <andi@firstfloor.org>
LKML-Reference: <1299072424.2175.24.camel@localhost>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-02 14:37:02 +01:00
Jan Beulich
e938c287ea x86: Fix a bogus unwind annotation in lib/semaphore_32.S
'simple' would have required specifying current frame address
and return address location manually, but that's obviously not
the case (and not necessary) here.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D6D1082020000780003454C@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-03-02 08:16:44 +01:00
Daniel J Blueman
6670e9cdaf x86, build: Make sure mkpiggy fails on read error
Ensure build doesn't silently continue despite read failure,
addressing a warning due to the unchecked call.

Signed-off-by: Daniel J Blueman <daniel.blueman@gmail.com>
LKML-Reference: <AANLkTimxxTMU3=4ry-_zbY6v1xiDi+hW9y1RegTr8vLK@mail.gmail.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2011-03-01 16:32:03 -08:00
Naga Chumbalkar
853cee26e2 [CPUFREQ] p4-clockmod: print EST-capable warning message only once
Print the message only once. I see it 16 times on a 2P box with 16 logical CPUs.

Signed-off-by: Naga Chumbalkar <nagananda.chumbalkar@hp.com>
2011-03-01 18:49:45 -05:00
Dave Jones
a536b126f2 [CPUFREQ] Fix another notifier leak in powernow-k8.
Do the notifier registration later, so we don't have to worry
about freeing it if we fail the msr allocation.

Signed-off-by: Dave Jones <davej@redhat.com>
2011-03-01 18:49:44 -05:00
Neil Brown
ac81831449 [CPUFREQ] Missing "unregister_cpu_notifier" in powernow-k8.c
It appears that when powernow-k8 finds that

    No compatible ACPI _PSS objects found.

 and suggests

    Try again with latest BIOS.

 it fails the module load, but does not unregister the cpu_notifier that was
 registered in powernowk8_init

 This ends up leaving freed memory on the cpu notifier list for some other
 poor module (e.g. md/raid5) to come along and trip over.

 The following might be a partial fix, but I suspect there is probably other
 clean-up that is needed.

 ( https://bugzilla.novell.com/show_bug.cgi?id=655215 has full dmesg traces).

Signed-off-by: Dave Jones <davej@redhat.com>
Signed-off-by: Neil Brown <neilb@suse.de>
2011-03-01 18:49:44 -05:00
Jan Beulich
039e13890b x86: Remove unused bits from lib/thunk_*.S
Some of the items removed were apparently never used, others
simply didn't get removed with their last user.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
LKML-Reference: <4D6BD3A002000078000341F1@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:22 +01:00
Jan Beulich
60cf637a13 x86: Use {push,pop}_cfi in more places
Cleaning up and shortening code...

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
LKML-Reference: <4D6BD35002000078000341DA@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:22 +01:00
Jan Beulich
39f2205e1a x86-64: Add CFI annotations to lib/rwsem_64.S
These weren't part of the initial commit of this code.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Cc: Alexander van Heukelum <heukelum@fastmail.fm>
LKML-Reference: <4D6BCDFF02000078000341B0@vpn.id2.novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 18:06:21 +01:00
Don Zickus
299c56966a x86: Use u32 instead of long to set reset vector back to 0
A customer of ours, complained that when setting the reset
vector back to 0, it trashed other data and hung their box.
They noticed when only 4 bytes were set to 0 instead of 8,
everything worked correctly.

Mathew pointed out:

 |
 | We're supposed to be resetting trampoline_phys_low and
 | trampoline_phys_high here, which are two 16-bit values.
 | Writing 64 bits is definitely going to overwrite space
 | that we're not supposed to be touching.
 |

So limit the area modified to u32.

Signed-off-by: Don Zickus <dzickus@redhat.com>
Acked-by: Matthew Garrett <mjg@redhat.com>
Cc: <stable@kernel.org>
LKML-Reference: <1297139100-424-1-git-send-email-dzickus@redhat.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-02-28 16:22:18 +01:00
Christoph Lameter
b9ec40af0e percpu, x86: Add arch-specific this_cpu_cmpxchg_double() support
Support this_cpu_cmpxchg_double() using the cmpxchg16b and cmpxchg8b
instructions.

-tj: s/percpu_cmpxchg16b/percpu_cmpxchg16b_double/ for consistency and
     other cosmetic changes.

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2011-02-28 11:20:49 +01:00
Stratos Psomadakis
7bf04be8f4 x86, asm: Cleanup unnecssary macros in asm-offsets.c
PAGE_SIZE_asm, PAGE_SHIFT_asm, THREAD_SIZE_asm can be safely removed from 
asm-offsets.c, and be replaced by their non-'_asm' counterparts in the code 
that uses them, since the _AC macro defined in include/linux/const.h makes
PAGE_SIZE/PAGE_SHIFT/THREAD_SIZE work with as.

Signed-off-by: Stratos Psomadakis <psomas@cslab.ece.ntua.gr>
LKML-Reference: <1298666774-17646-2-git-send-email-psomas@cslab.ece.ntua.gr>
Signed-off-by: H. Peter Anvin <hpa@zytor.com>
2011-02-25 16:37:32 -08:00
Linus Torvalds
958ede7f1b Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86 quirk: Fix polarity for IRQ0 pin2 override on SB800 systems
  x86/mrst: Fix apb timer rating when lapic timer is used
  x86: Fix reboot problem on VersaLogic Menlow boards
2011-02-25 14:02:33 -08:00