/*
 *  prepare to run common code
 *
 *  Copyright (C) 2000 Andrea Arcangeli <andrea@suse.de> SuSE
 */
#define DISABLE_BRANCH_PROFILING

#include <linux/init.h>
#include <linux/linkage.h>
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/string.h>
#include <linux/percpu.h>
#include <linux/start_kernel.h>
#include <linux/io.h>
#include <linux/memblock.h>
#include <linux/mem_encrypt.h>

#include <asm/processor.h>
#include <asm/proto.h>
#include <asm/smp.h>
#include <asm/setup.h>
#include <asm/desc.h>
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
#include <asm/sections.h>
#include <asm/kdebug.h>
#include <asm/e820/api.h>
#include <asm/bios_ebda.h>
#include <asm/bootparam_utils.h>
#include <asm/microcode.h>
#include <asm/kasan.h>

/*
 * Manage page tables very early on.
 */
extern pgd_t early_top_pgt[PTRS_PER_PGD];
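
/*
 * early_dynamic_pgts is a small pool of __initdata page-table pages used to
 * build mappings on demand before the permanent page tables exist;
 * next_early_pgt counts how many have been handed out and is reset by
 * reset_early_page_tables() when the pool runs dry.
 */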
extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
static unsigned int __initdata next_early_pgt;
pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
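
/*
 * Code in .head.text (tagged __head) runs from the physical address the
 * kernel was loaded at, before the final kernel mapping exists, so pointers
 * to kernel symbols cannot be used directly. fixup_pointer() rebases a
 * link-time symbol address onto that physical load address.
 */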
#define __head	__section(.head.text)

static void __head *fixup_pointer(void *ptr, unsigned long physaddr)
{
	return ptr - (void *)_text + (void *)physaddr;
}
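
/*
 * Called from the early boot assembly while still running on the identity
 * mapping: apply the load offset (plus the SME encryption mask, if active) to
 * the statically built early page tables, create an identity mapping covering
 * the loaded kernel image for the switchover, and return the SME mask to be
 * folded into the initial CR3 value.
 */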
unsigned long __head __startup_64(unsigned long physaddr)
{
	unsigned long load_delta, *p;
	unsigned long pgtable_flags;
	pgdval_t *pgd;
	p4dval_t *p4d;
	pudval_t *pud;
	pmdval_t *pmd, pmd_entry;
	int i;

	/* Is the address too large? */
	if (physaddr >> MAX_PHYSMEM_BITS)
		for (;;);

	/*
	 * Compute the delta between the address I am compiled to run at
	 * and the address I am actually running at.
	 */
	load_delta = physaddr - (unsigned long)(_text - __START_KERNEL_map);

	/* Is the address not 2M aligned? */
	if (load_delta & ~PMD_PAGE_MASK)
		for (;;);

	/* Activate Secure Memory Encryption (SME) if supported and enabled */
	sme_enable();

	/* Include the SME encryption mask in the fixup value */
	load_delta += sme_get_me_mask();

	/* Fixup the physical addresses in the page table */
	pgd = fixup_pointer(&early_top_pgt, physaddr);
	pgd[pgd_index(__START_KERNEL_map)] += load_delta;

	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
		p4d = fixup_pointer(&level4_kernel_pgt, physaddr);
		p4d[511] += load_delta;
	}

	pud = fixup_pointer(&level3_kernel_pgt, physaddr);
	pud[510] += load_delta;
	pud[511] += load_delta;

	pmd = fixup_pointer(level2_fixmap_pgt, physaddr);
	pmd[506] += load_delta;
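
	/*
	 * The kernel image may straddle a PGD/P4D/PUD boundary, which is why
	 * two consecutive entries are written at each level below and why the
	 * PMD loop covers the whole _text.._end range.
	 */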
	/*
	 * Set up the identity mapping for the switchover.  These
	 * entries should *NOT* have the global bit set!  This also
	 * creates a bunch of nonsense entries but that is fine --
	 * it avoids problems around wraparound.
	 */
	pud = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);
	pmd = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);

	pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask();

	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
		p4d = fixup_pointer(early_dynamic_pgts[next_early_pgt++], physaddr);

		i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
		pgd[i + 0] = (pgdval_t)p4d + pgtable_flags;
		pgd[i + 1] = (pgdval_t)p4d + pgtable_flags;

		i = (physaddr >> P4D_SHIFT) % PTRS_PER_P4D;
		p4d[i + 0] = (pgdval_t)pud + pgtable_flags;
		p4d[i + 1] = (pgdval_t)pud + pgtable_flags;
	} else {
		i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD;
		pgd[i + 0] = (pgdval_t)pud + pgtable_flags;
		pgd[i + 1] = (pgdval_t)pud + pgtable_flags;
	}

	i = (physaddr >> PUD_SHIFT) % PTRS_PER_PUD;
	pud[i + 0] = (pudval_t)pmd + pgtable_flags;
	pud[i + 1] = (pudval_t)pmd + pgtable_flags;

	pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL;
	pmd_entry += sme_get_me_mask();
	pmd_entry += physaddr;

	for (i = 0; i < DIV_ROUND_UP(_end - _text, PMD_SIZE); i++) {
		int idx = i + (physaddr >> PMD_SHIFT) % PTRS_PER_PMD;
		pmd[idx] = pmd_entry + i * PMD_SIZE;
	}

	/*
	 * Fixup the kernel text+data virtual addresses. Note that we might
	 * write invalid PMDs when the kernel is relocated; cleanup_highmap()
	 * fixes this up along with the mappings beyond _end.
	 */
	pmd = fixup_pointer(level2_kernel_pgt, physaddr);
	for (i = 0; i < PTRS_PER_PMD; i++) {
		if (pmd[i] & _PAGE_PRESENT)
			pmd[i] += load_delta;
	}

	/*
	 * Fixup phys_base - remove the memory encryption mask to obtain
	 * the true physical address.
	 */
	p = fixup_pointer(&phys_base, physaddr);
	*p += load_delta - sme_get_me_mask();

	/* Encrypt the kernel (if SME is active) */
	sme_encrypt_kernel();

	/*
	 * Return the SME encryption mask (if SME is active) to be used as a
	 * modifier for the initial pgdir entry programmed into CR3.
	 */
	return sme_get_me_mask();
}

unsigned long __startup_secondary_64(void)
{
	/*
	 * Return the SME encryption mask (if SME is active) to be used as a
	 * modifier for the initial pgdir entry programmed into CR3.
	 */
	return sme_get_me_mask();
}
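
/*
 * Until init_mem_mapping() builds the permanent direct mapping, accesses to
 * memory that the early page tables do not yet cover fault into the early
 * #PF handler, which calls early_make_pgtable() below to materialize 2M
 * mappings on demand from the early_dynamic_pgts pool. If the pool fills up,
 * the tables are simply wiped and rebuilt lazily.
 */
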
/* Wipe all early page tables except for the kernel symbol map */
static void __init reset_early_page_tables(void)
{
	memset(early_top_pgt, 0, sizeof(pgd_t) * (PTRS_PER_PGD - 1));
	next_early_pgt = 0;

	write_cr3(__sme_pa_nodebug(early_top_pgt));
}

/* Create a new PMD entry */
int __init early_make_pgtable(unsigned long address)
{
	unsigned long physaddr = address - __PAGE_OFFSET;
	pgdval_t pgd, *pgd_p;
	p4dval_t p4d, *p4d_p;
	pudval_t pud, *pud_p;
	pmdval_t pmd, *pmd_p;

	/* Invalid address or early pgt is done? */
	if (physaddr >= MAXMEM || read_cr3_pa() != __pa_nodebug(early_top_pgt))
		return -1;

again:
	pgd_p = &early_top_pgt[pgd_index(address)].pgd;
	pgd = *pgd_p;

	/*
	 * The use of __START_KERNEL_map rather than __PAGE_OFFSET here is
	 * critical -- __PAGE_OFFSET would point us back into the dynamic
	 * range and we might end up looping forever...
	 */
	if (!IS_ENABLED(CONFIG_X86_5LEVEL))
		p4d_p = pgd_p;
	else if (pgd)
		p4d_p = (p4dval_t *)((pgd & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
	else {
		if (next_early_pgt >= EARLY_DYNAMIC_PAGE_TABLES) {
			reset_early_page_tables();
			goto again;
		}

		p4d_p = (p4dval_t *)early_dynamic_pgts[next_early_pgt++];
		memset(p4d_p, 0, sizeof(*p4d_p) * PTRS_PER_P4D);
		*pgd_p = (pgdval_t)p4d_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
	}
	p4d_p += p4d_index(address);
	p4d = *p4d_p;

	if (p4d)
		pud_p = (pudval_t *)((p4d & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
	else {
		if (next_early_pgt >= EARLY_DYNAMIC_PAGE_TABLES) {
			reset_early_page_tables();
			goto again;
		}

		pud_p = (pudval_t *)early_dynamic_pgts[next_early_pgt++];
		memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD);
		*p4d_p = (p4dval_t)pud_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
	}
	pud_p += pud_index(address);
	pud = *pud_p;

	if (pud)
		pmd_p = (pmdval_t *)((pud & PTE_PFN_MASK) + __START_KERNEL_map - phys_base);
	else {
		if (next_early_pgt >= EARLY_DYNAMIC_PAGE_TABLES) {
			reset_early_page_tables();
			goto again;
		}

		pmd_p = (pmdval_t *)early_dynamic_pgts[next_early_pgt++];
		memset(pmd_p, 0, sizeof(*pmd_p) * PTRS_PER_PMD);
		*pud_p = (pudval_t)pmd_p - __START_KERNEL_map + phys_base + _KERNPG_TABLE;
	}
	pmd = (physaddr & PMD_MASK) + early_pmd_flags;
	pmd_p[pmd_index(address)] = pmd;

	return 0;
}

/*
 * Don't add a printk in there. printk relies on the PDA which is not
 * initialized yet.
 */
static void __init clear_bss(void)
{
	memset(__bss_start, 0,
	       (unsigned long) __bss_stop - (unsigned long) __bss_start);
}
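
/*
 * The boot command line pointer is split across two boot_params fields:
 * hdr.cmd_line_ptr holds the low 32 bits and ext_cmd_line_ptr the high
 * 32 bits; get_cmd_line_ptr() combines them.
 */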
static unsigned long get_cmd_line_ptr(void)
{
	unsigned long cmd_line_ptr = boot_params.hdr.cmd_line_ptr;

	cmd_line_ptr |= (u64)boot_params.ext_cmd_line_ptr << 32;

	return cmd_line_ptr;
}
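
/*
 * Copy boot_params and the kernel command line out of the boot loader's data
 * area into kernel variables. The data is accessed through its direct-mapping
 * (__va) address; if that range is not mapped yet, the access faults and
 * early_make_pgtable() maps it on demand.
 */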
static void __init copy_bootdata(char *real_mode_data)
{
	char *command_line;
	unsigned long cmd_line_ptr;

	memcpy(&boot_params, real_mode_data, sizeof boot_params);
	sanitize_boot_params(&boot_params);
	cmd_line_ptr = get_cmd_line_ptr();
	if (cmd_line_ptr) {
		command_line = __va(cmd_line_ptr);
		memcpy(boot_command_line, command_line, COMMAND_LINE_SIZE);
	}
}

asmlinkage __visible void __init x86_64_start_kernel(char *real_mode_data)
{
	int i;

	/*
	 * Build-time sanity checks on the kernel image and module
	 * area mappings. (these are purely build-time and produce no code)
	 */
	BUILD_BUG_ON(MODULES_VADDR < __START_KERNEL_map);
	BUILD_BUG_ON(MODULES_VADDR - __START_KERNEL_map < KERNEL_IMAGE_SIZE);
	BUILD_BUG_ON(MODULES_LEN + KERNEL_IMAGE_SIZE > 2 * PUD_SIZE);
	BUILD_BUG_ON((__START_KERNEL_map & ~PMD_MASK) != 0);
	BUILD_BUG_ON((MODULES_VADDR & ~PMD_MASK) != 0);
	BUILD_BUG_ON(!(MODULES_VADDR > __START_KERNEL));
	BUILD_BUG_ON(!(((MODULES_END - 1) & PGDIR_MASK) ==
				(__START_KERNEL & PGDIR_MASK)));
	BUILD_BUG_ON(__fix_to_virt(__end_of_fixed_addresses) <= MODULES_END);

	cr4_init_shadow();

	/* Kill off the identity-map trampoline */
	reset_early_page_tables();

	clear_bss();

	clear_page(init_top_pgt);

	/*
	 * SME support may update early_pmd_flags to include the memory
	 * encryption mask, so it needs to be called before anything
	 * that may generate a page fault.
	 */
	sme_early_init();

	kasan_early_init();

	for (i = 0; i < NUM_EXCEPTION_VECTORS; i++)
		set_intr_gate(i, early_idt_handler_array[i]);
	load_idt((const struct desc_ptr *)&idt_descr);

	copy_bootdata(__va(real_mode_data));

	/*
	 * Load microcode early on BSP.
	 */
	load_ucode_bsp();

	/* set init_top_pgt kernel high mapping */
	init_top_pgt[511] = early_top_pgt[511];

	x86_64_start_reservations(real_mode_data);
}

void __init x86_64_start_reservations(char *real_mode_data)
{
	/* hdr.version is non-zero only if boot_params has already been copied */
	if (!boot_params.hdr.version)
		copy_bootdata(__va(real_mode_data));

	x86_early_init_platform_quirks();

	switch (boot_params.hdr.hardware_subarch) {
	case X86_SUBARCH_INTEL_MID:
		x86_intel_mid_early_setup();
		break;
	default:
		break;
	}

	start_kernel();
}