5969 Commits

Author SHA1 Message Date
Tejun Heo
c8f3329a0d x86: use static _cpu_pda array
_cpu_pda array first uses statically allocated storage in data.init
and then switches to allocated bootmem to conserve space.  However,
after folding pda area into percpu area, _cpu_pda array will be
removed completely.  Drop the reallocation part to simplify the code
for soon-to-follow changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:40 +01:00
Tejun Heo
f32ff5388d x86: load pointer to pda into %gs while brining up a CPU
[ Based on original patch from Christoph Lameter and Mike Travis. ]

CPU startup code in head_64.S loaded address of a zero page into %gs
for temporary use till pda is loaded but address to the actual pda is
available at the point.  Load the real address directly instead.

This will help unifying percpu and pda handling later on.

This patch is mostly taken from Mike Travis' "x86_64: Fold pda into
per cpu area" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
2009-01-16 14:19:26 +01:00
Tejun Heo
3e5d8f9784 x86: make percpu symbols zerobased on SMP
[ Based on original patch from Christoph Lameter and Mike Travis. ]

This patch makes percpu symbols zerobased on x86_64 SMP by adding
PERCPU_VADDR() to vmlinux.lds.h which helps setting explicit vaddr on
the percpu output section and using it in vmlinux_64.lds.S.  A new
PHDR is added as existing ones cannot contain sections near address
zero.  PERCPU_VADDR() also adds a new symbol __per_cpu_load which
always points to the vaddr of the loaded percpu data.init region.

The following adjustments have been made to accomodate the address
change.

* code to locate percpu gdt_page in head_64.S is updated to add the
  load address to the gdt_page offset.

* __per_cpu_load is used in places where access to the init data area
  is necessary.

* pda->data_offset is initialized soon after C code is entered as zero
  value doesn't work anymore.

This patch is mostly taken from Mike Travis' "x86_64: Base percpu
variables at zero" patch.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:14 +01:00
Tejun Heo
a698c823e1 x86: make vmlinux_32.lds.S use PERCPU() macro
Make vmlinux_32.lds.S use the generic PERCPU() macro instead of open
coding it.  This will ease future changes.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:19:09 +01:00
Mike Travis
c90aa894f0 x86: cleanup early setup_percpu references
[ Based on original patch from Christoph Lameter and Mike Travis. ]

  * Ruggedize some calls in setup_percpu.c to prevent mishaps
    in early calls, particularly for non-critical functions.

  * Cleanup DEBUG_PER_CPU_MAPS usages and some comments.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:18:23 +01:00
Tejun Heo
f10fcd4712 x86: make early_per_cpu() a lvalue and use it
Make early_per_cpu() a lvalue as per_cpu() is and use it where
applicable.

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:18:17 +01:00
Tejun Heo
7de6883faa x86: fix pda_to_op()
There's no instruction to move a 64bit immediate into memory location.
Drop "i".

Signed-off-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 14:18:11 +01:00
Jan Beulich
a3c6018e56 x86: fix assumed to be contiguous leaf page tables for kmap_atomic region (take 2)
Debugging and original patch from Nick Piggin <npiggin@suse.de>

The early fixmap pmd entry inserted at the very top of the KVA is causing the
subsequent fixmap mapping code to not provide physically linear pte pages over
the kmap atomic portion of the fixmap (which relies on said property to
calculate pte addresses).

This has caused weird boot failures in kmap_atomic much later in the boot
process (initial userspace faults) on a 32-bit PAE system with a larger number
of CPUs (smaller CPU counts tend not to run over into the next page so don't
show up the problem).

Solve this by attempting to clear out the page table, and copy any of its
entries to the new one. Also, add a bug if a nonlinear condition is encountered
and can't be resolved, which might save some hours of debugging if this fragile
scheme ever breaks again...

Once we have such logic, we can also use it to eliminate the early ioremap
trickery around the page table setup for the fixmap area. This also fixes
potential issues with FIX_* entries sharing the leaf page table with the early
ioremap ones getting discarded by early_ioremap_clear() and not restored by
early_ioremap_reset(). It at once eliminates the temporary (and configuration,
namely NR_CPUS, dependent) unavailability of early fixed mappings during the
time the fixmap area page tables get constructed.

Finally, also replace the hard coded calculation of the initial table space
needed for the fixmap area with a proper one, allowing kernels configured for
large CPU counts to actually boot.

Based-on: Nick Piggin <npiggin@suse.de>
Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-16 13:47:04 +01:00
Linus Torvalds
b5db0e3865 Revert "x86 PAT: remove CPA WARN_ON for zero pte"
This reverts commit 58dab916dfb57328d50deb0aa9b3fc92efa248ff, which
makes my Nehalem come to a nasty crawling almost-halt.  It looks like it
turns off caching of regular kernel RAM, with the understandable
slowdown of a few orders of magnitude as a result.

Acked-by: Ingo Molnar <mingo@elte.hu>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Peter Anvin <hpa@zytor.com>
Cc: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Cc: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-15 16:25:09 -08:00
Cliff Wickman
18c07cf530 x86, UV: cpu_relax in uv_wait_completion
The function uv_wait_completion() spins on reads of a memory-mapped
register, waiting for completion of BAU hardware replies.

It should call "cpu_relax()" between those reads to improve performance
on hyperthreaded configurations.

Signed-off-by: Cliff Wickman <cpw@sgi.com>
Acked-by: Jack Steiner <steiner@sgi.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 23:48:20 +01:00
Jan Beulich
4a13ad0bd8 x86: avoid early crash in disable_local_APIC()
E.g. when called due to an early panic.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 23:48:19 +01:00
Ingo Molnar
c847a9c713 Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096 2009-01-15 18:37:07 +01:00
Mike Travis
f2a0827119 x86: fix build warning when CONFIG_NUMA not defined.
Impact: fix build warning

The macro cpu_to_node did not reference it's argument, and instead
simply returned a 0.  This causes a "unused variable" warning if
it's the only reference in a function (show_cache_disable).

Replace it with the more correct inline function.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-15 09:19:32 -08:00
Ingo Molnar
5cd7376200 fix: crash: IP: __bitmap_intersects+0x48/0x73
-tip testing found this crash:

> [   35.258515] calling  acpi_cpufreq_init+0x0/0x127 @ 1
> [   35.264127] BUG: unable to handle kernel NULL pointer dereference at (null)
> [   35.267554] IP: [<ffffffff80478092>] __bitmap_intersects+0x48/0x73
> [   35.267554] PGD 0
> [   35.267554] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC

arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c is still broken: there's no
allocation of the variable mask, so we pass in an uninitialized cmd.mask
field to drv_read(), which then passes it to the scheduler which then
crashes ...

Switch it over to the much simpler constant-cpumask-pointers approach.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 15:46:12 +01:00
Ingo Molnar
49a93bc978 Merge branch 'linus' into cpus4096 2009-01-15 15:45:31 +01:00
Ingo Molnar
7f268f4352 Merge branches 'cpus4096', 'x86/cleanups' and 'x86/urgent' into x86/percpu 2009-01-15 13:18:57 +01:00
Ingo Molnar
54da5b3d44 x86: fix broken flush_tlb_others_ipi(), fix
Impact: cleanup

Use the proper type.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 13:04:58 +01:00
Jan Beulich
a08c4743ed x86: avoid early crash in disable_local_APIC()
E.g. when called due to an early panic.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 12:04:40 +01:00
Jan Beulich
f11826385b x86: fully honor "nolapic"
Impact: widen the effect of the 'nolapic' boot parameter

"nolapic" should not only suppress SMP and use of the LAPIC, but it
also ought to have the effect of disabling all IO-APIC related activity
as well as PCI MSI and HT-IRQs.

Signed-off-by: Jan Beulich <jbeulich@novell.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-15 12:04:40 +01:00
Linus Torvalds
bca268565f Merge branch 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6
* 'syscalls' of git://git390.osdl.marist.edu/pub/scm/linux-2.6: (44 commits)
  [CVE-2009-0029] s390 specific system call wrappers
  [CVE-2009-0029] System call wrappers part 33
  [CVE-2009-0029] System call wrappers part 32
  [CVE-2009-0029] System call wrappers part 31
  [CVE-2009-0029] System call wrappers part 30
  [CVE-2009-0029] System call wrappers part 29
  [CVE-2009-0029] System call wrappers part 28
  [CVE-2009-0029] System call wrappers part 27
  [CVE-2009-0029] System call wrappers part 26
  [CVE-2009-0029] System call wrappers part 25
  [CVE-2009-0029] System call wrappers part 24
  [CVE-2009-0029] System call wrappers part 23
  [CVE-2009-0029] System call wrappers part 22
  [CVE-2009-0029] System call wrappers part 21
  [CVE-2009-0029] System call wrappers part 20
  [CVE-2009-0029] System call wrappers part 19
  [CVE-2009-0029] System call wrappers part 18
  [CVE-2009-0029] System call wrappers part 17
  [CVE-2009-0029] System call wrappers part 16
  [CVE-2009-0029] System call wrappers part 15
  ...
2009-01-14 19:58:40 -08:00
Harvey Harrison
74d96f0186 byteorder: make swab.h include asm/swab.h like a regular header
Add swab.h to kbuild.asm and remove the individual entries from
each arch, mark as unifdef as some arches have some kernel-only
bits inside.

Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-01-14 19:56:50 -08:00
Sam Ravnborg
2ea038917b Revert "kbuild: strip generated symbols from *.ko"
This reverts commit ad7a953c522ceb496611d127e51e278bfe0ff483.

And commit: ("allow stripping of generated symbols under CONFIG_KALLSYMS_ALL")
            9bb482476c6c9d1ae033306440c51ceac93ea80c

These stripping patches has caused a set of issues:

1) People have reported compatibility issues with binutils due to
   lack of support for `--strip-unneeded-symbols' with objcopy 2.15.92.0.2
   Reported by: Wenji
2) ccache and distcc no longer works as expeced
   Reported by: Ted, Roland, + others
3) The installed modules increased a lot in size
   Reported by: Ted, Davej + others

Reported-by: Wenji Huang <wenji.huang@oracle.com>
Reported-by: "Theodore Ts'o" <tytso@mit.edu>
Reported-by: Dave Jones <davej@redhat.com>
Reported-by: Roland McGrath <roland@redhat.com>
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
2009-01-14 21:38:20 +01:00
Suresh Siddha
5cca0cf15a x86, pat: fix reserve_memtype() for legacy 1MB range
Thierry Vignaud reported:
> http://bugzilla.kernel.org/show_bug.cgi?id=12372
>
> On P4 with an SiS motherboard (video card is a SiS 651)
> X server fails to start with error:
> xf86MapVidMem: Could not mmap framebuffer (0x00000000,0x2000) (Invalid
> argument)

Here X is trying to map first 8KB of memory using /dev/mem. Existing
code treats first 0-4KB of memory as non-RAM and 4KB-8KB as RAM. Recent
code changes don't allow to map memory with different attributes
at the same time.

Fix this by treating the first 1MB legacy region as special and always
track the attribute requests with in this region using linear linked
list (and don't bother if the range is RAM or non-RAM or mixed)

Reported-and-tested-by: Thierry Vignaud <tvignaud@mandriva.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 20:14:45 +01:00
Heiko Carstens
e55380edf6 [CVE-2009-0029] Rename old_readdir to sys_old_readdir
This way it matches the generic system call name convention.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2009-01-14 14:15:15 +01:00
Yinghai Lu
b665967979 x86: make 32bit MAX_HARDIRQS_PER_CPU to be NR_VECTORS
Impact: clean up to be same as 64bit

32-bit is using per-cpu vector too, so don't use default NR_IRQS.

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:17:02 +01:00
Ingo Molnar
e46d51787e Merge branch 'master' of ssh://master.kernel.org/pub/scm/linux/kernel/git/travis/linux-2.6-cpus4096-for-ingo into cpus4096 2009-01-14 12:13:45 +01:00
Ingo Molnar
0a2a18b721 x86: change the default cache size to 64 bytes
Right now the generic cacheline size is 128 bytes - that is wasteful
when structures are aligned, as all modern x86 CPUs have an (effective)
cacheline sizes of 64 bytes.

It was set to 128 bytes due to some cacheline aliasing problems on
older P4 systems, but those are many years old and we dont optimize
for them anymore. (They'll still get the 128 bytes cacheline size if
the kernel is specifically built for Pentium 4)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Arjan van de Ven <arjan@linux.intel.com>
2009-01-14 12:04:56 +01:00
Frederik Deweerdt
09b3ec7315 x86, tlb flush_data: replace per_cpu with an array
Impact: micro-optimization, memory reduction

On x86_64 flush tlb data is stored in per_cpu variables. This is
unnecessary because only the first NUM_INVALIDATE_TLB_VECTORS entries
are accessed.

This patch aims at making the code less confusing (there's nothing
really "per_cpu") by using a plain array. It also would save some memory
on most distros out there (Ubuntu x86_64 has NR_CPUS=64 by default).

[ Ravikiran G Thirumalai also pointed out that the correct alignment
  is ____cacheline_internodealigned_in_smp, so that there's no
  bouncing on vsmp. ]

Signed-off-by: Frederik Deweerdt <frederik.deweerdt@xprog.eu>
Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 12:04:53 +01:00
Jaswinder Singh Rajput
c2c21745ec x86: replacing mp_config_intsrc with mpc_intsrc
Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 11:58:35 +01:00
Jaswinder Singh Rajput
b5ba7e6d1e x86: replacing mp_config_ioapic with mpc_ioapic
Impact: cleanup, solve 80 columns wrap problems

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 11:58:27 +01:00
Suresh Siddha
a4a0acf8e1 x86: fix broken flush_tlb_others_ipi()
This commit broke flush_tlb_others_ipi() causing boot hangs on a
16 logical cpu system:

>	commit 4595f9620cda8a1e973588e743cf5f8436dd20c6
>	Author: Rusty Russell <rusty@rustcorp.com.au>
>	Date:   Sat Jan 10 21:58:09 2009 -0800
>
>	    x86: change flush_tlb_others to take a const struct cpumask

This change resulted in sending the invalidate tlb vector to the
sender itself causing the hang. flush_tlb_others_ipi() should exclude
the sender itself from the destination list.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-14 08:51:06 +01:00
venkatesh.pallipadi@intel.com
58dab916df x86 PAT: remove CPA WARN_ON for zero pte
Impact: reduce scope of debug check - avoid warnings

The logic to find whether identity map exists or not using
high_memory or max_low_pfn_mapped/max_pfn_mapped are not complete
as the memory withing the range may not be mapped if there is a
unusable hole in e820.

Specifically, on my test system I started seeing these warnings with
tools like hwinfo, acpidump trying to map ACPI region.

[   27.400018] ------------[ cut here ]------------
[   27.400344] WARNING: at /home/venkip/src/linus/linux-2.6/arch/x86/mm/pageattr.c:560 __change_page_attr_set_clr+0xf3/0x8b8()
[   27.400821] Hardware name: X7DB8
[   27.401070] CPA: called for zero pte. vaddr = ffff8800cff6a000 cpa->vaddr = ffff8800cff6a000
[   27.401569] Modules linked in:
[   27.401882] Pid: 4913, comm: dmidecode Not tainted 2.6.28-05716-gfe0bdec #586
[   27.402141] Call Trace:
[   27.402488]  [<ffffffff80237c21>] warn_slowpath+0xd3/0x10f
[   27.402749]  [<ffffffff80274ade>] ? find_get_page+0xb3/0xc9
[   27.403028]  [<ffffffff80274a2b>] ? find_get_page+0x0/0xc9
[   27.403333]  [<ffffffff80226425>] __change_page_attr_set_clr+0xf3/0x8b8
[   27.403628]  [<ffffffff8028ec99>] ? __purge_vmap_area_lazy+0x192/0x1a1
[   27.403883]  [<ffffffff8028eb52>] ? __purge_vmap_area_lazy+0x4b/0x1a1
[   27.404172]  [<ffffffff80290268>] ? vm_unmap_aliases+0x1ab/0x1bb
[   27.404512]  [<ffffffff80290105>] ? vm_unmap_aliases+0x48/0x1bb
[   27.404766]  [<ffffffff80226d28>] change_page_attr_set_clr+0x13e/0x2e6
[   27.405026]  [<ffffffff80698fa7>] ? _spin_unlock+0x26/0x2a
[   27.405292]  [<ffffffff80227e6a>] ? reserve_memtype+0x19b/0x4e3
[   27.405590]  [<ffffffff80226ffd>] _set_memory_wb+0x22/0x24
[   27.405844]  [<ffffffff80225d28>] ioremap_change_attr+0x26/0x28
[   27.406097]  [<ffffffff80228355>] reserve_pfn_range+0x1a3/0x235
[   27.406427]  [<ffffffff80228430>] track_pfn_vma_new+0x49/0xb3
[   27.406686]  [<ffffffff80286c46>] remap_pfn_range+0x94/0x32c
[   27.406940]  [<ffffffff8022878d>] ? phys_mem_access_prot_allowed+0xb5/0x1a8
[   27.407209]  [<ffffffff803e9bf4>] mmap_mem+0x75/0x9d
[   27.407523]  [<ffffffff8028b3b4>] mmap_region+0x2cf/0x53e
[   27.407776]  [<ffffffff8028b8cc>] do_mmap_pgoff+0x2a9/0x30d
[   27.408034]  [<ffffffff8020f4a4>] sys_mmap+0x92/0xce
[   27.408339]  [<ffffffff8020b65b>] system_call_fastpath+0x16/0x1b
[   27.408614] ---[ end trace 4b16ad70c09a602d ]---
[   27.408871] dmidecode:4913 reserve_pfn_range ioremap_change_attr failed write-back for cff6a000-cff6b000

This is wih track_pfn_vma_new trying to keep identity map in sync.
The address cff6a000 is the ACPI region according to e820.

[    0.000000] BIOS-provided physical RAM map:
[    0.000000]  BIOS-e820: 0000000000000000 - 000000000009c000 (usable)
[    0.000000]  BIOS-e820: 000000000009c000 - 00000000000a0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000cc000 - 00000000000d0000 (reserved)
[    0.000000]  BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
[    0.000000]  BIOS-e820: 0000000000100000 - 00000000cff60000 (usable)
[    0.000000]  BIOS-e820: 00000000cff60000 - 00000000cff69000 (ACPI data)
[    0.000000]  BIOS-e820: 00000000cff69000 - 00000000cff80000 (ACPI NVS)
[    0.000000]  BIOS-e820: 00000000cff80000 - 00000000d0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
[    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
[    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
[    0.000000]  BIOS-e820: 00000000ff000000 - 0000000100000000 (reserved)
[    0.000000]  BIOS-e820: 0000000100000000 - 0000000230000000 (usable)

And is not mapped as per init_memory_mapping.

[    0.000000] init_memory_mapping: 0000000000000000-00000000cff60000
[    0.000000] init_memory_mapping: 0000000100000000-0000000230000000

We can add logic to check for this. But, there can also be other holes in
identity map when we have 1GB of aligned reserved space in e820.

This patch handles it by removing the WARN_ON and returning a specific
error value (EFAULT) to indicate that the address does not have any
identity mapping.

The code that tries to keep identity map in sync can ignore
this error, with other callers of cpa still getting error here.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 19:13:02 +01:00
venkatesh.pallipadi@intel.com
cdecff6864 x86 PAT: return compatible mapping to remap_pfn_range callers
Impact: avoid warning message, potentially solve 3D performance regression

Change x86 PAT code to return compatible memtype if the exact memtype that
was requested in remap_pfn_rage and friends is not available due to some
conflict.

This is done by returning the compatible type in pgprot parameter of
track_pfn_vma_new(), and the caller uses that memtype for page table.

Note that track_pfn_vma_copy() which is basically called during fork gets the
prot from existing page table and should not have any conflict. Hence we use
strict memtype check there and do not allow compatible memtypes.

This patch fixes the bug reported here:

  http://marc.info/?l=linux-kernel&m=123108883716357&w=2

Specifically the error message:

  X:5010 map pfn expected mapping type write-back for d0000000-d0101000,
  got write-combining

Should go away.

Reported-and-bisected-by: Kevin Winchester <kjwinchester@gmail.com>
Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 19:13:02 +01:00
venkatesh.pallipadi@intel.com
e4b866ed19 x86 PAT: change track_pfn_vma_new to take pgprot_t pointer param
Impact: cleanup

Change the protection parameter for track_pfn_vma_new() into a pgprot_t pointer.
Subsequent patch changes the x86 PAT handling to return a compatible
memtype in pgprot_t, if what was requested cannot be allowed due to conflicts.
No fuctionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 19:13:01 +01:00
venkatesh.pallipadi@intel.com
afc7d20c84 x86 PAT: consolidate old memtype new memtype check into a function
Impact: cleanup

Move the new memtype old memtype allowed check to header so that is can be
shared by other users. Subsequent patch uses this in pat.c in remap_pfn_range()
code path. No functionality change in this patch.

Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 19:13:00 +01:00
Andi Kleen
c8399943bd x86, generic: mark complex bitops.h inlines as __always_inline
Impact: reduce kernel image size

Hugh Dickins noticed that older gcc versions when the kernel
is built for code size didn't inline some of the bitops.

Mark all complex x86 bitops that have more than a single
asm statement or two as always inline to avoid this problem.

Probably should be done for other architectures too.

Ingo then found a better fix that only requires
a single line change, but it unfortunately only
works on gcc 4.3.

On older gccs the original patch still makes a ~0.3% defconfig
difference with CONFIG_OPTIMIZE_INLINING=y.

With gcc 4.1 and a defconfig like build:

    6116998 1138540  883788 8139326  7c323e vmlinux-oi-with-patch
    6137043 1138540  883788 8159371  7c808b vmlinux-optimize-inlining

~20k / 0.3% difference.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 18:56:30 +01:00
Ingo Molnar
4a922a969c x86, cpufreq: remove leftover copymask_copy()
Impact: fix potential boot crash on MAXSMP

Remove code left over by:

  50c668d: Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read

That cmd.cpumask is not allocated anymore. No impact on default !MAXSMP
kernels.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-13 16:11:00 +01:00
Yinghai Lu
4a046d1754 x86: arch_probe_nr_irqs
Impact: save RAM with large NR_CPUS, get smaller nr_irqs

Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-12 17:39:24 -08:00
Ingo Molnar
e8cea892df Revert "i386: add TRACE_IRQS_OFF for the nmi"
This reverts commit e0c7317557c8fc8eacf611e30c2a80f4e24e47a3.

This patch was wrong, as lockdep (and thus the irq state tracer)
aren't nmi safe. People are already seeing lockdep warnings due
to this.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:36:59 +01:00
Ingo Molnar
50c668d678 Revert "cpumask: use work_on_cpu in acpi-cpufreq.c for drv_read and drv_write"
This reverts commit 7503bfbae89eba07b46441a5d1594647f6b8ab7d.

Dieter Ries reported bootup soft-hangs and bisected it back to
this commit, and reverting this commit gave him a working system.

The commit introduces work_on_cpu() use into the cpufreq code,
but that is subtly problematic from a lock hierarchy POV: the
hotplug-cpu lock is an highlevel lock that is taken before
lowlevel locks, and in this codepath we are called with the
policy lock taken.

Dieter did not have lockdep enabled so we dont have a nice stack
trace proof for this, but using work_on_cpu() in such a lowlevel
place certainly looks wrong, so we revert the patch.

work_on_cpu() needs to be reworked to be more generally usable.

Reported-by: Dieter Ries <clip2@gmx.de>
Tested-by: Dieter Ries <clip2@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:23 +01:00
Jaswinder Singh Rajput
2bc1379712 x86: fix apic.c build error on latest git
Fix this by reintroducing asm/smp.h include in apic.c - later on
I will fix this by removing non-smp data from smp.h

Also fix the __inquire_remote_apic() prototype/inline.

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:23 +01:00
Jaswinder Singh Rajput
4884d8e6a0 x86: fix mpparse.c build error on latest git
Fix this by reintroducing asm/smp.h include in mpparse.c - later on
I will fix this by removing non-smp data from smp.h.

Reported-by: Petr Titera <P.Titera@century.cz>
Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:22 +01:00
Andi Kleen
f313e12308 x86: avoid theoretical vmalloc fault loop
Ajith Kumar noticed:

 I was going through the vmalloc fault handling for x86_64 and am unclear
 about the following lines in the vmalloc_fault() function.

 pgd = pgd_offset(current->mm ?: &init_mm, address);
 pgd_ref = pgd_offset_k(address);

 Here the intention is to get the pgd corresponding to the current process
 and sync it up with the pgd in init_mm(obtained from pgd_offset_k).
 However, for kernel threads current->mm is NULL and hence pgd =
 pgd_offset(init_mm, address) = pgd_ref which means the fault handler
 returns without setting the pgd entry in the MM structure in the context
 of which the kernel thread has faulted.  This could lead to never-ending
 faults and busy looping of kernel threads like pdflush.  So, shouldn't the
 pgd = pgd_offset(current->mm ?: &init_mm, address); be pgd =
 pgd_offset(current->active_mm ?: &init_mm, address);

We can use active_mm unconditionally because it should be always set.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 19:24:21 +01:00
Peter Zijlstra
6d612b0f94 locking, hpet: annotate false positive warning
Alexander Beregalov reported that this warning is caused by the HPET code:

> hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0
> hpet0: 3 comparators, 64-bit 14.318180 MHz counter
> ODEBUG: object is on stack, but not annotated
> ------------[ cut here ]------------
> WARNING: at lib/debugobjects.c:251 __debug_object_init+0x2a4/0x352()

> Bisected down to 26afe5f2fbf06ea0765aaa316640c4dd472310c0
> (x86: HPET_MSI Initialise per-cpu HPET timers)

The commit is fine - but the on-stack workqueue entry needs annotation.

Reported-and-bisected-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Tested-by: Alexander Beregalov <a.beregalov@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 13:33:20 +01:00
Jaswinder Singh Rajput
3b9dc9f2f1 x86: module_64.c fix style problems
Impact: cleanup

Fix:

 ERROR: trailing whitespace
 ERROR: code indent should use tabs where possible
 WARNING: %Ld/%Lu are not-standard C, use %lld/%llu
 WARNING: printk() should include KERN_ facility level
 ERROR: spaces required around that '=' (ctx:VxW)

 total: 13 errors, 2 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:23:01 +01:00
Jaswinder Singh Rajput
e17029ad69 x86: module_32.c fix style problems
Impact: cleanup

Fix:

 ERROR: code indent should use tabs where possible
 ERROR: trailing whitespace
 ERROR: spaces required around that '=' (ctx:VxW)

 total: 3 errors, 0 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:55 +01:00
Jaswinder Singh Rajput
448dd2fa3e x86: msr.c fix style problems
Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>

 total: 0 errors, 1 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:50 +01:00
Jaswinder Singh Rajput
dd3feda774 x86: microcode_intel.c fix style problems
Impact: cleanup

Fix:

 WARNING: Use #include <linux/uaccess.h> instead of <asm/uaccess.h>
 ERROR: trailing whitespace
 ERROR: "(foo*)" should be "(foo *)"

 total: 3 errors, 1 warnings

Signed-off-by: Jaswinder Singh Rajput <jaswinderrajput@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-12 11:22:40 +01:00
Mike Travis
92296c6d6e cpumask, irq: non-x86 build failures
Ingo Molnar wrote:

> All non-x86 architectures fail to build:
>
> In file included from /home/mingo/tip/include/linux/random.h:11,
>                  from /home/mingo/tip/include/linux/stackprotector.h:6,
>                  from /home/mingo/tip/init/main.c:17:
> /home/mingo/tip/include/linux/irqnr.h:26:63: error: asm/irq_vectors.h: No such file or directory

Do not include asm/irq_vectors.h in generic code - it's not available
on all architectures.

Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-01-11 19:13:45 +01:00
Mike Travis
9332fccded irq: initialize nr_irqs based on nr_cpu_ids
Impact: Reduce memory usage.

This is the second half of the changes to make the irq_desc_ptrs be
variable sized based on nr_cpu_ids.  This is done by adding a new
"max_nr_irqs" macro to irq_vectors.h (and a dummy in irqnr.h) to
return a max NR_IRQS value based on NR_CPUS or nr_cpu_ids.

This necessitated moving the define of MAX_IO_APICS to a separate
file (asm/apicnum.h) so it could be included without the baggage
of the other asm/apicdef.h declarations.

Signed-off-by: Mike Travis <travis@sgi.com>
2009-01-11 19:13:38 +01:00