linux/arch/x86/mm
Tang Chen e8d1955258 acpi, memory-hotplug: parse SRAT before memblock is ready
On linux, the pages used by kernel could not be migrated.  As a result,
if a memory range is used by kernel, it cannot be hot-removed.  So if we
want to hot-remove memory, we should prevent kernel from using it.

The way now used to prevent this is specify a memory range by
movablemem_map boot option and set it as ZONE_MOVABLE.

But when the system is booting, memblock will allocate memory, and
reserve the memory for kernel.  And before we parse SRAT, and know the
node memory ranges, memblock is working.  And it may allocate memory in
ranges to be set as ZONE_MOVABLE.  This memory can be used by kernel,
and never be freed.

So, let's parse SRAT before memblock is called first.  And it is early
enough.

The first call of memblock_find_in_range_node() is in:

  setup_arch()
    |-->setup_real_mode()

so, this patch add a function early_parse_srat() to parse SRAT, and call
it before setup_real_mode() is called.

NOTE:

1) early_parse_srat() is called before numa_init(), and has initialized
   numa_meminfo.  So DO NOT clear numa_nodes_parsed in numa_init() and DO
   NOT zero numa_meminfo in numa_init(), otherwise we will lose memory
   numa info.

2) I don't know why using count of memory affinities parsed from SRAT
   as a return value in original acpi_numa_init().  So I add a static
   variable srat_mem_cnt to remember this count and use it as the return
   value of the new acpi_numa_init()

[mhocko@suse.cz: parse SRAT before memblock is ready fix]
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Jianguo Wu <wujianguo@huawei.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Wu Jianguo <wujianguo@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Len Brown <lenb@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-02-23 17:50:14 -08:00
..
kmemcheck bug.h: add include of it to various implicit C users 2012-02-29 17:15:08 -05:00
amdtopology.c x86, NUMA: Enable CONFIG_AMD_NUMA on 32bit too 2011-05-02 17:24:48 +02:00
dump_pagetables.c x86, mm: Create symbolic index into address_markers array 2010-07-20 16:56:19 -07:00
extable.c x86, extable: Switch to relative exception table entries 2012-04-20 17:22:34 -07:00
fault.c x86: Do not leak kernel page mapping locations 2013-02-07 19:57:44 +01:00
gup.c thp: add compound tail page _mapcount when mapped 2011-12-09 07:50:28 -08:00
highmem_32.c highmem: kill all __kmap_atomic() 2012-03-20 21:48:30 +08:00
hugetlbpage.c mm: use vm_unmapped_area() in hugetlbfs on i386 architecture 2012-12-11 17:22:25 -08:00
init_32.c memory-hotplug: introduce new arch_remove_memory() for removing page table 2013-02-23 17:50:12 -08:00
init_64.c memory-hotplug: remove memmap of sparse-vmemmap 2013-02-23 17:50:12 -08:00
init.c x86, kexec, 64bit: Only set ident mapping for ram. 2013-01-29 15:26:35 -08:00
iomap_32.c mm: fix race in kunmap_atomic() 2010-10-27 18:03:05 -07:00
ioremap.c Revert "x86, mm: Include the entire kernel memory map in trampoline_pgd" 2012-12-15 12:29:54 -08:00
kmmio.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
Makefile memblock, x86: Replace memblock_x86_reserve/free_range() with generic ones 2011-07-14 11:47:53 -07:00
memtest.c x86/memtest: Shorten time for tests 2013-02-18 09:28:42 +01:00
mm_internal.h x86, mm: Move after_bootmem to mm_internel.h 2012-11-17 11:59:45 -08:00
mmap.c x86: Fix mmap random address range 2011-12-05 17:07:23 +01:00
mmio-mod.c module_param: make bool parameters really bool (arch) 2012-01-13 09:32:18 +10:30
numa_32.c x86-32, mm: Rip out x86_32 NUMA remapping code 2013-01-31 14:12:30 -08:00
numa_64.c x86, mm: kill numa_free_all_bootmem() 2012-11-17 11:59:47 -08:00
numa_emulation.c x86: print physical addresses consistently with other parts of kernel 2012-05-29 16:22:21 -07:00
numa_internal.h x86-32, mm: Rip out x86_32 NUMA remapping code 2013-01-31 14:12:30 -08:00
numa.c acpi, memory-hotplug: parse SRAT before memblock is ready 2013-02-23 17:50:14 -08:00
pageattr-test.c x86: Convert vmalloc()+memset() to vzalloc() 2011-05-28 19:53:57 +02:00
pageattr.c memory-hotplug: common APIs to support page tables hot-remove 2013-02-23 17:50:12 -08:00
pat_internal.h x86, pat: Fix memory leak in free_memtype 2010-05-26 11:26:04 -07:00
pat_rbtree.c rbtree: move augmented rbtree functionality to rbtree_augmented.h 2012-10-09 16:22:40 +09:00
pat.c x86, mm: Make DEBUG_VIRTUAL work earlier in boot 2013-01-25 16:33:22 -08:00
pf_in.c x86: Eliminate various 'set but not used' warnings 2011-05-21 19:10:33 +02:00
pf_in.h
pgtable_32.c Disintegrate asm/system.h for X86 2012-03-28 18:11:12 +01:00
pgtable.c Linux 3.8-rc5 2013-01-25 16:31:21 -08:00
physaddr.c x86, mm: Make DEBUG_VIRTUAL work earlier in boot 2013-01-25 16:33:22 -08:00
physaddr.h x86: split __phys_addr out into separate file 2009-09-10 11:48:55 -07:00
setup_nx.c x86, cpu: Only CPU features determine NX capabilities 2010-11-10 15:43:15 -08:00
srat.c x86/srat: Simplify memory affinity init error handling 2013-01-24 14:20:00 +01:00
testmmiotrace.c x86, kmmio/mmiotrace: Fix double free of kmmio_fault_pages 2010-06-18 11:30:09 +02:00
tlb.c x86: Convert a few mistaken __cpuinit annotations to __init 2013-01-24 17:12:19 +01:00