2005-04-17 02:20:36 +04:00
/*
 *	x86 SMP booting functions
 *
 *	(c) 1995 Alan Cox, Building #3 <alan@redhat.com>
 *	(c) 1998, 1999, 2000 Ingo Molnar <mingo@redhat.com>
 *
 *	Much of the core SMP work is based on previous work by Thomas Radke, to
 *	whom a great many thanks are extended.
 *
 *	Thanks to Intel for making available several different Pentium,
 *	Pentium Pro and Pentium-II/Xeon MP machines.
 *	Original development of Linux SMP code supported by Caldera.
 *
 *	This code is released under the GNU General Public License version 2 or
 *	later.
 *
 *	Fixes
 *		Felix Koop		:	NR_CPUS used properly
 *		Jose Renau		:	Handle single CPU case.
 *		Alan Cox		:	By repeated request 8) - Total BogoMIPS report.
 *		Greg Wright		:	Fix for kernel stacks panic.
 *		Erich Boleyn		:	MP v1.4 and additional changes.
 *		Matthias Sattler	:	Changes for 2.1 kernel map.
 *		Michel Lespinasse	:	Changes for 2.1 kernel map.
 *		Michael Chastain	:	Change trampoline.S to gnu as.
 *		Alan Cox		:	Dumb bug: 'B' step PPro's are fine
 *		Ingo Molnar		:	Added APIC timers, based on code
 *						from Jose Renau
 *		Ingo Molnar		:	various cleanups and rewrites
 *		Tigran Aivazian		:	fixed "0.00 in /proc/uptime on SMP" bug.
 *		Maciej W. Rozycki	:	Bits for genuine 82489DX APICs
 *		Martin J. Bligh		:	Added support for multi-quad systems
 *		Dave Jones		:	Report invalid combinations of Athlon CPUs.
 *		Rusty Russell		:	Hacked into shape for new "hotplug" boot process. */
# include <linux/module.h>
# include <linux/init.h>
# include <linux/kernel.h>
# include <linux/mm.h>
# include <linux/sched.h>
# include <linux/kernel_stat.h>
# include <linux/bootmem.h>
2005-06-26 01:54:50 +04:00
# include <linux/notifier.h>
# include <linux/cpu.h>
# include <linux/percpu.h>
2007-03-07 20:12:31 +03:00
# include <linux/nmi.h>
2005-04-17 02:20:36 +04:00
# include <linux/delay.h>
# include <linux/mc146818rtc.h>
# include <asm/tlbflush.h>
# include <asm/desc.h>
# include <asm/arch_hooks.h>
2006-06-26 15:57:01 +04:00
# include <asm/nmi.h>
2005-04-17 02:20:36 +04:00
# include <mach_apic.h>
# include <mach_wakecpu.h>
# include <smpboot_hooks.h>
2007-02-13 15:26:21 +03:00
# include <asm/vmi.h>
[PATCH] x86: Save the MTRRs of the BSP before booting an AP
Applied fix by Andrew Morton:
http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'.
AMD and Intel x86 CPU manuals state that it is the responsibility of
system software to initialize and maintain MTRR consistency across
all processors in Multi-Processing Environments.
Quote from page 188 of the AMD64 System Programming manual (Volume 2):
7.6.5 MTRRs in Multi-Processing Environments
"In multi-processing environments, the MTRRs located in all processors must
characterize memory in the same way. Generally, this means that identical
values are written to the MTRRs used by the processors." (short omission here)
"Failure to do so may result in coherency violations or loss of atomicity.
Processor implementations do not check the MTRR settings in other processors
to ensure consistency. It is the responsibility of system software to
initialize and maintain MTRR consistency across all processors."
Current Linux MTRR code already implements the above in the case that the
BIOS does not properly initialize MTRRs on the secondary processors,
but the case where the fixed-range MTRRs of the boot processor are changed
after Linux started to boot, before the initialisation of a secondary
processor, is not handled yet.
In this case, secondary processors are currently initialized by Linux
with MTRRs which the boot processor had very early, when mtrr_bp_init()
did run, but not with the MTRRs which the boot processor uses at the
time when that secondary processor is actually booted,
causing differing MTRR contents on the secondary processors.
Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the
BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs
of the boot processor when it transitions the system into ACPI mode.
The SMI handler of the BIOS does this in SMM, entered while Linux ACPI
code runs acpi_enable().
Other occasions where the SMI handler of the BIOS may change bits in
the MTRRs could occur as well. To initialize newly booted secondary
processors with the fixed-range MTRRs which the boot processor uses
at that time, this patch saves the fixed-range MTRRs of the boot
processor before new secondary processors are started. When the
secondary processors run their Linux initialisation code, their
fixed-range MTRRs will be updated with the saved fixed-range MTRRs.
If CONFIG_MTRR is not set, we define mtrr_save_state
as an empty statement because there is nothing to do.
Possible TODOs:
*) CPU-hotplugging outside of SMP suspend/resume is not yet tested
with this patch.
*) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up,
then the calls to mtrr_save_state() could be replaced by calls to
mtrr_save_fixed_ranges(NULL) and mtrr_save_state() would not be
needed.
That would need either verification of the CPU-hotplug code or
at least a test on a >2 CPU machine.
*) The MTRRs of other running processors are not yet checked at this
time but it might be interesting to synchronize the MTRRs of all
processors before booting. That would be an incremental patch,
but of rather low priority since there is no machine known so
far which would require this.
AK: moved prototypes on x86-64 around to fix warnings
Signed-off-by: Bernhard Kaindl <bk@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
2007-05-02 21:27:17 +04:00
# include <asm/mtrr.h>
2005-04-17 02:20:36 +04:00
/* Set if we find a B stepping CPU */
2007-12-20 01:20:18 +03:00
static int __cpuinitdata smp_b_stepping ;
2005-04-17 02:20:36 +04:00
/* Number of siblings per CPU package */
int smp_num_siblings = 1 ;
2005-06-23 11:08:33 +04:00
EXPORT_SYMBOL ( smp_num_siblings ) ;
2005-06-26 01:54:54 +04:00
2006-03-27 13:15:22 +04:00
/* Last level cache ID of each logical CPU */
2007-10-19 22:35:03 +04:00
DEFINE_PER_CPU ( u8 , cpu_llc_id ) = BAD_APICID ;
2006-03-27 13:15:22 +04:00
2005-11-05 19:25:54 +03:00
/* representing HT siblings of each logical CPU */
2007-10-16 12:24:05 +04:00
DEFINE_PER_CPU ( cpumask_t , cpu_sibling_map ) ;
EXPORT_PER_CPU_SYMBOL ( cpu_sibling_map ) ;
2005-06-26 01:54:54 +04:00
2005-11-05 19:25:54 +03:00
/* representing HT and core siblings of each logical CPU */
2007-10-16 12:24:04 +04:00
DEFINE_PER_CPU ( cpumask_t , cpu_core_map ) ;
EXPORT_PER_CPU_SYMBOL ( cpu_core_map ) ;
2005-06-26 01:54:54 +04:00
2005-04-17 02:20:36 +04:00
/* bitmap of online cpus */
2005-07-08 04:56:59 +04:00
cpumask_t cpu_online_map __read_mostly ;
2005-06-23 11:08:33 +04:00
EXPORT_SYMBOL ( cpu_online_map ) ;
2005-04-17 02:20:36 +04:00
cpumask_t cpu_callin_map ;
cpumask_t cpu_callout_map ;
2005-09-04 02:56:51 +04:00
cpumask_t cpu_possible_map ;
EXPORT_SYMBOL ( cpu_possible_map ) ;
2005-04-17 02:20:36 +04:00
static cpumask_t smp_commenced_mask ;
/* Per CPU bogomips and other parameters */
2007-10-19 22:35:04 +04:00
DEFINE_PER_CPU_SHARED_ALIGNED ( struct cpuinfo_x86 , cpu_info ) ;
EXPORT_PER_CPU_SYMBOL ( cpu_info ) ;
2005-04-17 02:20:36 +04:00
2007-10-19 22:35:03 +04:00
/*
 * The following static array is used during kernel startup
 * and the x86_cpu_to_apicid_ptr contains the address of the
 * array during this time. It is zeroed when the per_cpu
 * data area is removed.
 */
u8 x86_cpu_to_apicid_init[NR_CPUS] __initdata =
			{ [0 ... NR_CPUS-1] = BAD_APICID };
void * x86_cpu_to_apicid_ptr ;
DEFINE_PER_CPU ( u8 , x86_cpu_to_apicid ) = BAD_APICID ;
EXPORT_PER_CPU_SYMBOL ( x86_cpu_to_apicid ) ;
2005-04-17 02:20:36 +04:00
2006-09-29 12:58:46 +04:00
u8 apicid_2_node [ MAX_APICID ] ;
2005-04-17 02:20:36 +04:00
/*
 * Trampoline 80x86 program as an array.
 */
2007-10-17 20:04:37 +04:00
extern const unsigned char trampoline_data [ ] ;
extern const unsigned char trampoline_end [ ] ;
2005-04-17 02:20:36 +04:00
static unsigned char * trampoline_base ;
static int trampoline_exec ;
static void map_cpu_to_logical_apicid ( void ) ;
2005-06-26 01:54:50 +04:00
/* State of each CPU. */
DEFINE_PER_CPU ( int , cpu_state ) = { 0 } ;
2005-04-17 02:20:36 +04:00
/*
 * Currently trivial. Write the real->protected mode
 * bootstrap into the page concerned. The caller
 * has made sure it's suitably aligned.
 */
2007-10-17 20:04:32 +04:00
static unsigned long __cpuinit setup_trampoline(void)
2005-04-17 02:20:36 +04:00
{
	memcpy(trampoline_base, trampoline_data, trampoline_end - trampoline_data);
	return virt_to_phys(trampoline_base);
}
/*
 * We are called very early to get the low memory for the
 * SMP bootup trampoline page.
 */
void __init smp_alloc_memory(void)
{
	trampoline_base = (void *) alloc_bootmem_low_pages(PAGE_SIZE);
	/*
	 * Has to be in very low memory so we can execute
	 * real-mode AP code.
	 */
	if (__pa(trampoline_base) >= 0x9F000)
		BUG();
	/*
	 * Make the SMP trampoline executable:
	 */
	trampoline_exec = set_kernel_exec((unsigned long)trampoline_base, 1);
}
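The two constraints on the trampoline page (page alignment, and a physical address below 0x9F000 so real-mode code can run there) can be sketched in user space. This is a minimal illustration, not kernel code: `trampoline_addr_ok()` and `startup_vector()` are hypothetical helper names, and the sketch assumes 4 KiB pages. It also shows why alignment matters: the STARTUP IPI sent later carries only `start_eip >> 12`, i.e. the physical page number.

```c
#include <assert.h>
#include <stdint.h>

/* Upper bound checked by smp_alloc_memory() in the listing. */
#define SKETCH_TRAMPOLINE_LIMIT 0x9F000u
/* Assumption: 4 KiB pages, as on 32-bit x86. */
#define SKETCH_PAGE_SHIFT 12u

/* Returns 1 if phys is page-aligned and below the low-memory limit. */
int trampoline_addr_ok(uint32_t phys)
{
	uint32_t offset_mask = (1u << SKETCH_PAGE_SHIFT) - 1u;

	return (phys & offset_mask) == 0 && phys < SKETCH_TRAMPOLINE_LIMIT;
}

/* The value placed in the STARTUP IPI: the physical page number. */
uint8_t startup_vector(uint32_t phys)
{
	assert(trampoline_addr_ok(phys));
	return (uint8_t)(phys >> SKETCH_PAGE_SHIFT);
}
```

For example, a trampoline at physical 0x9E000 passes the check and yields page number 0x9E, while 0x9F000 itself is rejected, matching the `>=` comparison in `smp_alloc_memory()`.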
/*
 * The bootstrap kernel entry code has set these up. Save them for
 * a given CPU
 */
2007-07-18 05:37:03 +04:00
void __cpuinit smp_store_cpu_info(int id)
2005-04-17 02:20:36 +04:00
{
2007-10-19 22:35:04 +04:00
	struct cpuinfo_x86 *c = &cpu_data(id);
2005-04-17 02:20:36 +04:00
	*c = boot_cpu_data;
2007-10-19 22:35:04 +04:00
	c->cpu_index = id;
2005-04-17 02:20:36 +04:00
	if (id != 0)
2007-05-02 21:27:12 +04:00
		identify_secondary_cpu(c);
2005-04-17 02:20:36 +04:00
	/*
	 * Mask B, Pentium, but not Pentium MMX
	 */
	if (c->x86_vendor == X86_VENDOR_INTEL &&
	    c->x86 == 5 &&
	    c->x86_mask >= 1 && c->x86_mask <= 4 &&
	    c->x86_model <= 3)
		/*
		 * Remember we have B step Pentia with bugs
		 */
		smp_b_stepping = 1;
	/*
	 * Certain Athlons might work (for various values of 'work') in SMP
	 * but they are not certified as MP capable.
	 */
	if ((c->x86_vendor == X86_VENDOR_AMD) && (c->x86 == 6)) {
2006-09-26 12:52:34 +04:00
		if (num_possible_cpus() == 1)
			goto valid_k7;
2005-04-17 02:20:36 +04:00
		/* Athlon 660/661 is valid. */
		if ((c->x86_model == 6) && ((c->x86_mask == 0) || (c->x86_mask == 1)))
			goto valid_k7;
		/* Duron 670 is valid */
		if ((c->x86_model == 7) && (c->x86_mask == 0))
			goto valid_k7;
		/*
		 * Athlon 662, Duron 671, and Athlon >model 7 have capability bit.
		 * It's worth noting that the A5 stepping (662) of some Athlon XP's
		 * have the MP bit set.
		 * See http://www.heise.de/newsticker/data/jow-18.10.01-000 for more.
		 */
		if (((c->x86_model == 6) && (c->x86_mask >= 2)) ||
		    ((c->x86_model == 7) && (c->x86_mask >= 1)) ||
		     (c->x86_model > 7))
			if (cpu_has_mp)
				goto valid_k7;
		/* If we get here, it's not a certified SMP capable AMD system. */
2005-09-13 12:25:16 +04:00
		add_taint(TAINT_UNSAFE_SMP);
2005-04-17 02:20:36 +04:00
	}
valid_k7:
	;
}
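The family-6 AMD decision in `smp_store_cpu_info()` is easier to follow as a pure function. The sketch below restates the same conditions in user space; `k7_smp_certified()` is a hypothetical name, and its parameters stand in for `c->x86_model`, `c->x86_mask`, `cpu_has_mp` and `num_possible_cpus()`. A `false` result corresponds to the path that reaches `add_taint(TAINT_UNSAFE_SMP)`.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical restatement of the Athlon/Duron checks above.
 * Returns true where the listing would jump to valid_k7.
 */
bool k7_smp_certified(unsigned model, unsigned stepping, bool mp_bit,
		      unsigned possible_cpus)
{
	assert(possible_cpus >= 1);

	/* A uniprocessor configuration is always fine. */
	if (possible_cpus == 1)
		return true;
	/* Athlon 660/661 is valid. */
	if (model == 6 && stepping <= 1)
		return true;
	/* Duron 670 is valid. */
	if (model == 7 && stepping == 0)
		return true;
	/* Later parts carry a capability bit; trust it when present. */
	if ((model == 6 && stepping >= 2) ||
	    (model == 7 && stepping >= 1) ||
	    model > 7)
		return mp_bit;
	/* Anything else (for example models below 6) is not certified. */
	return false;
}
```

Written this way, it is visible that the MP capability bit is consulted only for the later steppings and models; the early Athlon 660/661 and Duron 670 parts are accepted unconditionally.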
extern void calibrate_delay ( void ) ;
static atomic_t init_deasserted ;
2007-01-11 03:52:44 +03:00
static void __cpuinit smp_callin(void)
2005-04-17 02:20:36 +04:00
{
	int cpuid, phys_id;
	unsigned long timeout;
	/*
	 * If woken up by an INIT in an 82489DX configuration
	 * we may get here before an INIT-deassert IPI reaches
	 * our local APIC. We have to wait for the IPI or we'll
	 * lock up on an APIC access.
	 */
	wait_for_init_deassert(&init_deasserted);
	/*
	 * (This works even if the APIC is not enabled.)
	 */
	phys_id = GET_APIC_ID(apic_read(APIC_ID));
	cpuid = smp_processor_id();
	if (cpu_isset(cpuid, cpu_callin_map)) {
		printk("huh, phys CPU#%d, CPU#%d already present??\n",
					phys_id, cpuid);
		BUG();
	}
	Dprintk("CPU#%d (phys ID: %d) waiting for CALLOUT\n", cpuid, phys_id);
	/*
	 * STARTUP IPIs are fragile beasts as they might sometimes
	 * trigger some glue motherboard logic. Complete APIC bus
	 * silence for 1 second, this overestimates the time the
	 * boot CPU is spending to send the up to 2 STARTUP IPIs
	 * by a factor of two. This should be enough.
	 */
	/*
	 * Waiting 2s total for startup (udelay is not yet working)
	 */
	timeout = jiffies + 2*HZ;
	while (time_before(jiffies, timeout)) {
		/*
		 * Has the boot CPU finished its STARTUP sequence?
		 */
		if (cpu_isset(cpuid, cpu_callout_map))
			break;
		rep_nop();
	}
	if (!time_before(jiffies, timeout)) {
		printk("BUG: CPU%d started up but did not get a callout!\n",
			cpuid);
		BUG();
	}
	/*
	 * the boot CPU has finished the init stage and is spinning
	 * on callin_map until we finish. We are free to set up this
	 * CPU, first the APIC. (this is probably redundant on most
	 * boards)
	 */
	Dprintk("CALLIN, before setup_local_APIC().\n");
	smp_callin_clear_local_apic();
	setup_local_APIC();
	map_cpu_to_logical_apicid();
	/*
	 * Get our bogomips.
	 */
	calibrate_delay();
	Dprintk("Stack at about %p\n", &cpuid);
	/*
	 * Save our processor parameters
	 */
2007-02-16 12:28:04 +03:00
	smp_store_cpu_info(cpuid);
2005-04-17 02:20:36 +04:00
	/*
	 * Allow the master to continue.
	 */
	cpu_set(cpuid, cpu_callin_map);
}
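The callout wait in `smp_callin()` depends on `time_before()` staying correct when the tick counter wraps around. A minimal user-space sketch of that comparison follows. `tick_before()` and `polls_before_deadline()` are hypothetical names; the signed-difference trick is the usual technique for free-running counters, and the cast to `long` assumes the common two's-complement conversion, which ISO C leaves implementation-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Wrap-safe "a is earlier than b" for a free-running tick counter.
 * The unsigned subtraction wraps; the sign of the difference tells
 * which value comes first, provided the two are less than half the
 * counter range apart.
 */
bool tick_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Stand-in for the wait loop: count how many polls fit before the
 * deadline when every poll advances the counter by one tick.
 */
unsigned long polls_before_deadline(unsigned long start, unsigned long budget)
{
	unsigned long now = start;
	unsigned long deadline = start + budget;	/* may wrap */
	unsigned long polls = 0;

	assert(budget <= (unsigned long)LONG_MAX);
	while (tick_before(now, deadline)) {
		polls++;
		now++;	/* stands in for the timer tick advancing jiffies */
	}
	return polls;
}
```

Starting two ticks before the counter overflows with a budget of four ticks still yields exactly four polls, whereas a plain `now < deadline` comparison would stop immediately because the deadline has wrapped to a small value.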
static int cpucount ;
2006-03-27 13:15:22 +04:00
/* maps the cpu to the sched domain representing multi-core */
cpumask_t cpu_coregroup_map(int cpu)
{
2007-10-19 22:35:04 +04:00
	struct cpuinfo_x86 *c = &cpu_data(cpu);
2006-03-27 13:15:22 +04:00
	/*
	 * For perf, we return last level cache shared map.
2006-06-27 13:54:42 +04:00
	 * And for power savings, we return cpu_core_map
2006-03-27 13:15:22 +04:00
	 */
2006-06-27 13:54:42 +04:00
	if (sched_mc_power_savings || sched_smt_power_savings)
2007-10-16 12:24:04 +04:00
		return per_cpu(cpu_core_map, cpu);
2006-06-27 13:54:42 +04:00
	else
		return c->llc_shared_map;
2006-03-27 13:15:22 +04:00
}
2005-11-05 19:25:54 +03:00
/* representing cpus for which sibling maps can be computed */
static cpumask_t cpu_sibling_setup_map ;
2007-07-22 13:12:33 +04:00
void __cpuinit set_cpu_sibling_map(int cpu)
2005-06-26 01:54:54 +04:00
{
	int i;
2007-10-19 22:35:04 +04:00
	struct cpuinfo_x86 *c = &cpu_data(cpu);
2005-11-05 19:25:54 +03:00
	cpu_set(cpu, cpu_sibling_setup_map);
2005-06-26 01:54:54 +04:00
	if (smp_num_siblings > 1) {
2005-11-05 19:25:54 +03:00
		for_each_cpu_mask(i, cpu_sibling_setup_map) {
2007-10-19 22:35:04 +04:00
			if (c->phys_proc_id == cpu_data(i).phys_proc_id &&
			    c->cpu_core_id == cpu_data(i).cpu_core_id) {
2007-10-16 12:24:05 +04:00
				cpu_set(i, per_cpu(cpu_sibling_map, cpu));
				cpu_set(cpu, per_cpu(cpu_sibling_map, i));
2007-10-16 12:24:04 +04:00
				cpu_set(i, per_cpu(cpu_core_map, cpu));
				cpu_set(cpu, per_cpu(cpu_core_map, i));
2007-10-19 22:35:04 +04:00
				cpu_set(i, c->llc_shared_map);
				cpu_set(cpu, cpu_data(i).llc_shared_map);
2005-06-26 01:54:54 +04:00
			}
		}
	} else {
2007-10-16 12:24:05 +04:00
		cpu_set(cpu, per_cpu(cpu_sibling_map, cpu));
2005-06-26 01:54:54 +04:00
	}
2007-10-19 22:35:04 +04:00
	cpu_set(cpu, c->llc_shared_map);
2006-03-27 13:15:22 +04:00
2005-11-05 19:25:54 +03:00
	if (current_cpu_data.x86_max_cores == 1) {
2007-10-16 12:24:05 +04:00
		per_cpu(cpu_core_map, cpu) = per_cpu(cpu_sibling_map, cpu);
2007-10-19 22:35:04 +04:00
		c->booted_cores = 1;
2005-11-05 19:25:54 +03:00
		return;
	}
	for_each_cpu_mask(i, cpu_sibling_setup_map) {
2007-10-19 22:35:03 +04:00
		if (per_cpu(cpu_llc_id, cpu) != BAD_APICID &&
		    per_cpu(cpu_llc_id, cpu) == per_cpu(cpu_llc_id, i)) {
2007-10-19 22:35:04 +04:00
			cpu_set(i, c->llc_shared_map);
			cpu_set(cpu, cpu_data(i).llc_shared_map);
2006-03-27 13:15:22 +04:00
		}
2007-10-19 22:35:04 +04:00
		if (c->phys_proc_id == cpu_data(i).phys_proc_id) {
2007-10-16 12:24:04 +04:00
			cpu_set(i, per_cpu(cpu_core_map, cpu));
			cpu_set(cpu, per_cpu(cpu_core_map, i));
2005-11-05 19:25:54 +03:00
			/*
			 * Does this new cpu bring up a new core?
			 */
2007-10-16 12:24:05 +04:00
			if (cpus_weight(per_cpu(cpu_sibling_map, cpu)) == 1) {
2005-11-05 19:25:54 +03:00
				/*
				 * for each core in package, increment
				 * the booted_cores for this new cpu
				 */
2007-10-16 12:24:05 +04:00
				if (first_cpu(per_cpu(cpu_sibling_map, i)) == i)
2007-10-19 22:35:04 +04:00
					c->booted_cores++;
2005-11-05 19:25:54 +03:00
				/*
				 * increment the core count for all
				 * the other cpus in this package
				 */
				if (i != cpu)
2007-10-19 22:35:04 +04:00
					cpu_data(i).booted_cores++;
			} else if (i != cpu && !c->booted_cores)
				c->booted_cores = cpu_data(i).booted_cores;
2005-11-05 19:25:54 +03:00
		}
2005-06-26 01:54:54 +04:00
	}
}
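The core of `set_cpu_sibling_map()` is a symmetric update: when a CPU comes up, it is compared against every CPU already registered, and matching pairs are recorded in both directions. A simplified user-space sketch of that idea follows. `struct topo` and `topo_add_cpu()` are hypothetical, plain 32-bit masks replace `cpumask_t`, and the `llc_shared_map` and `booted_cores` bookkeeping from the listing is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_CPUS 8

/* Hypothetical, simplified topology state (one bit per CPU). */
struct topo {
	int pkg[SKETCH_MAX_CPUS];		/* like phys_proc_id */
	int core[SKETCH_MAX_CPUS];		/* like cpu_core_id */
	uint32_t sibling[SKETCH_MAX_CPUS];	/* same package and same core */
	uint32_t coremap[SKETCH_MAX_CPUS];	/* same package */
	uint32_t setup;				/* CPUs registered so far */
};

/* Register a CPU and update the maps of every CPU it is related to. */
void topo_add_cpu(struct topo *t, int cpu, int pkg, int core)
{
	assert(cpu >= 0 && cpu < SKETCH_MAX_CPUS);

	t->pkg[cpu] = pkg;
	t->core[cpu] = core;
	t->setup |= 1u << cpu;

	for (int i = 0; i < SKETCH_MAX_CPUS; i++) {
		if (!(t->setup & (1u << i)))
			continue;		/* not brought up yet */
		if (t->pkg[i] != pkg)
			continue;		/* different package */
		/* Same package: record the relation in both directions. */
		t->coremap[cpu] |= 1u << i;
		t->coremap[i] |= 1u << cpu;
		if (t->core[i] == core) {
			/* Same core as well: these are HT siblings. */
			t->sibling[cpu] |= 1u << i;
			t->sibling[i] |= 1u << cpu;
		}
	}
}
```

Because the new CPU is added to the setup mask before the loop, it also matches itself, so every CPU ends up in its own sibling and core masks, just as in the listing.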
2005-04-17 02:20:36 +04:00
/*
 * Activate a secondary processor.
 */
2007-01-11 03:52:44 +03:00
static void __cpuinit start_secondary(void *unused)
2005-04-17 02:20:36 +04:00
{
	/*
2007-05-02 21:27:10 +04:00
	 * Don't put *anything* before cpu_init(), SMP booting is so
	 * fragile that we want to limit the things done here to the
	 * most necessary things.
2005-04-17 02:20:36 +04:00
	 */
2007-02-13 15:26:21 +03:00
#ifdef CONFIG_VMI
	vmi_bringup();
#endif
2007-05-02 21:27:10 +04:00
	cpu_init();
2005-11-09 08:39:01 +03:00
	preempt_disable();
2005-04-17 02:20:36 +04:00
	smp_callin();
	while (!cpu_isset(smp_processor_id(), smp_commenced_mask))
		rep_nop();
2007-02-16 12:27:34 +03:00
	/*
	 * Check TSC synchronization with the BP:
	 */
	check_tsc_sync_target();
2007-02-13 15:26:21 +03:00
	setup_secondary_clock();
2005-04-17 02:20:36 +04:00
	if (nmi_watchdog == NMI_IO_APIC) {
		disable_8259A_irq(0);
		enable_NMI_through_LVT0(NULL);
		enable_8259A_irq(0);
	}
	/*
	 * low-memory mappings have been cleared, flush them from
	 * the local TLBs too.
	 */
	local_flush_tlb();
2005-06-26 01:54:53 +04:00
2005-06-26 01:54:54 +04:00
	/* This must be done before setting cpu_online_map */
	set_cpu_sibling_map(raw_smp_processor_id());
	wmb();
2005-06-26 01:54:53 +04:00
	/*
	 * We need to hold call_lock, so there is no inconsistency
	 * between the time smp_call_function() determines number of
2007-10-20 03:13:56 +04:00
	 * IPI recipients, and the time when the determination is made
2005-06-26 01:54:53 +04:00
	 * for which cpus receive the IPI. Holding this
	 * lock helps us to not include this cpu in a currently in progress
	 * smp_call_function().
	 */
	lock_ipi_call_lock();
2005-04-17 02:20:36 +04:00
	cpu_set(smp_processor_id(), cpu_online_map);
2005-06-26 01:54:53 +04:00
	unlock_ipi_call_lock();
2005-06-26 01:54:56 +04:00
	per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE;
2005-04-17 02:20:36 +04:00
	/* We can take interrupts now: we're officially "up". */
	local_irq_enable();
	wmb();
	cpu_idle();
}
/*
 * Everything has been set up for the secondary
 * CPUs - they just need to reload everything
 * from the task structure
 * This function must not return.
 */
2005-06-26 01:54:55 +04:00
void __devinit initialize_secondary(void)
2005-04-17 02:20:36 +04:00
{
	/*
	 * We don't actually need to load the full TSS,
	 * basically just the stack pointer and the eip.
	 */
	asm volatile(
		"movl %0,%%esp\n\t"
		"jmp *%1"
		:
2006-12-07 04:14:02 +03:00
		:"m" (current->thread.esp),"m" (current->thread.eip));
2005-04-17 02:20:36 +04:00
}
2006-12-07 04:14:02 +03:00
/* Static state in head.S used to set up a CPU */
2005-04-17 02:20:36 +04:00
extern struct {
void * esp ;
unsigned short ss ;
} stack_start ;
#ifdef CONFIG_NUMA
/* which logical CPUs are on which nodes */
2008-01-30 15:30:38 +03:00
cpumask_t node_to_cpumask_map[MAX_NUMNODES] __read_mostly =
2005-04-17 02:20:36 +04:00
				{ [0 ... MAX_NUMNODES-1] = CPU_MASK_NONE };
2008-01-30 15:30:38 +03:00
EXPORT_SYMBOL(node_to_cpumask_map);
2005-04-17 02:20:36 +04:00
/* which node each logical CPU is on */
2008-01-30 15:30:38 +03:00
int cpu_to_node_map[NR_CPUS] __read_mostly = { [0 ... NR_CPUS-1] = 0 };
EXPORT_SYMBOL(cpu_to_node_map);
2005-04-17 02:20:36 +04:00
/* set up a mapping between cpu and node. */
static inline void map_cpu_to_node(int cpu, int node)
{
	printk("Mapping cpu %d to node %d\n", cpu, node);
2008-01-30 15:30:38 +03:00
	cpu_set(cpu, node_to_cpumask_map[node]);
	cpu_to_node_map[cpu] = node;
2005-04-17 02:20:36 +04:00
}
/* undo a mapping between cpu and node. */
static inline void unmap_cpu_to_node(int cpu)
{
	int node;
	printk("Unmapping cpu %d from all nodes\n", cpu);
	for (node = 0; node < MAX_NUMNODES; node++)
2008-01-30 15:30:38 +03:00
		cpu_clear(cpu, node_to_cpumask_map[node]);
	cpu_to_node_map[cpu] = 0;
2005-04-17 02:20:36 +04:00
}
#else /* !CONFIG_NUMA */
#define map_cpu_to_node(cpu, node)	({})
#define unmap_cpu_to_node(cpu)	({})
#endif /* CONFIG_NUMA */
2005-07-08 04:56:59 +04:00
u8 cpu_2_logical_apicid[NR_CPUS] __read_mostly = { [0 ... NR_CPUS-1] = BAD_APICID };
2005-04-17 02:20:36 +04:00
static void map_cpu_to_logical_apicid ( void )
{
int cpu = smp_processor_id ( ) ;
int apicid = logical_smp_processor_id ( ) ;
2006-10-04 05:25:52 +04:00
int node = apicid_to_node ( apicid ) ;
2006-09-26 03:25:35 +04:00
if ( ! node_online ( node ) )
node = first_online_node ;
2005-04-17 02:20:36 +04:00
cpu_2_logical_apicid [ cpu ] = apicid ;
2006-09-26 03:25:35 +04:00
map_cpu_to_node ( cpu , node ) ;
2005-04-17 02:20:36 +04:00
}
static void unmap_cpu_to_logical_apicid ( int cpu )
{
cpu_2_logical_apicid [ cpu ] = BAD_APICID ;
unmap_cpu_to_node ( cpu ) ;
}
static inline void __inquire_remote_apic(int apicid)
{
	int i, regs[] = { APIC_ID >> 4, APIC_LVR >> 4, APIC_SPIV >> 4 };
	char *names[] = { "ID", "VERSION", "SPIV" };
2007-05-02 21:27:17 +04:00
	int timeout;
	unsigned long status;
2005-04-17 02:20:36 +04:00
	printk("Inquiring remote APIC #%d...\n", apicid);
2005-11-07 11:58:31 +03:00
	for (i = 0; i < ARRAY_SIZE(regs); i++) {
2005-04-17 02:20:36 +04:00
		printk("... APIC #%d %s: ", apicid, names[i]);
		/*
		 * Wait for idle.
		 */
2007-05-02 21:27:17 +04:00
		status = safe_apic_wait_icr_idle();
		if (status)
			printk("a previous APIC delivery may have failed\n");
2005-04-17 02:20:36 +04:00
		apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(apicid));
		apic_write_around(APIC_ICR, APIC_DM_REMRD | regs[i]);
		timeout = 0;
		do {
			udelay(100);
			status = apic_read(APIC_ICR) & APIC_ICR_RR_MASK;
		} while (status == APIC_ICR_RR_INPROG && timeout++ < 1000);
		switch (status) {
		case APIC_ICR_RR_VALID:
			status = apic_read(APIC_RRR);
2007-05-02 21:27:21 +04:00
			printk("%lx\n", status);
2005-04-17 02:20:36 +04:00
			break;
		default:
			printk("failed\n");
		}
	}
}
#ifdef WAKE_SECONDARY_VIA_NMI
/*
 * Poke the other CPU in the eye via NMI to wake it up. Remember that the normal
 * INIT, INIT, STARTUP sequence will reset the chip hard for us, and this
 * won't ... remember to clear down the APIC, etc later.
 */
2005-06-26 01:54:55 +04:00
static int __devinit
2005-04-17 02:20:36 +04:00
wakeup_secondary_cpu(int logical_apicid, unsigned long start_eip)
{
2007-05-02 21:27:17 +04:00
	unsigned long send_status, accept_status = 0;
	int maxlvt;
2005-04-17 02:20:36 +04:00
	/* Target chip */
	apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(logical_apicid));
	/* Boot on the stack */
	/* Kick the second */
	apic_write_around(APIC_ICR, APIC_DM_NMI | APIC_DEST_LOGICAL);
	Dprintk("Waiting for send to finish...\n");
2007-05-02 21:27:17 +04:00
	send_status = safe_apic_wait_icr_idle();
2005-04-17 02:20:36 +04:00
	/*
	 * Give the other CPU some time to accept the IPI.
	 */
	udelay(200);
	/*
	 * Due to the Pentium erratum 3AP.
	 */
2007-02-16 12:27:58 +03:00
	maxlvt = lapic_get_maxlvt();
2005-04-17 02:20:36 +04:00
	if (maxlvt > 3) {
		apic_read_around(APIC_SPIV);
		apic_write(APIC_ESR, 0);
	}
	accept_status = (apic_read(APIC_ESR) & 0xEF);
	Dprintk("NMI sent.\n");
	if (send_status)
		printk("APIC never delivered???\n");
	if (accept_status)
		printk("APIC delivery error (%lx).\n", accept_status);
	return (send_status | accept_status);
}
#endif	/* WAKE_SECONDARY_VIA_NMI */
#ifdef WAKE_SECONDARY_VIA_INIT
2005-06-26 01:54:55 +04:00
static int __devinit
2005-04-17 02:20:36 +04:00
wakeup_secondary_cpu(int phys_apicid, unsigned long start_eip)
{
2007-05-02 21:27:17 +04:00
	unsigned long send_status, accept_status = 0;
	int maxlvt, num_starts, j;
2005-04-17 02:20:36 +04:00
	/*
	 * Be paranoid about clearing APIC errors.
	 */
	if (APIC_INTEGRATED(apic_version[phys_apicid])) {
		apic_read_around(APIC_SPIV);
		apic_write(APIC_ESR, 0);
		apic_read(APIC_ESR);
	}
	Dprintk("Asserting INIT.\n");
	/*
	 * Turn INIT on target chip
	 */
	apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(phys_apicid));
	/*
	 * Send IPI
	 */
	apic_write_around(APIC_ICR, APIC_INT_LEVELTRIG | APIC_INT_ASSERT
				| APIC_DM_INIT);
	Dprintk("Waiting for send to finish...\n");
2007-05-02 21:27:17 +04:00
	send_status = safe_apic_wait_icr_idle();
2005-04-17 02:20:36 +04:00
	mdelay(10);
	Dprintk("Deasserting INIT.\n");
	/* Target chip */
	apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(phys_apicid));
	/* Send IPI */
	apic_write_around(APIC_ICR, APIC_INT_LEVELTRIG | APIC_DM_INIT);
	Dprintk("Waiting for send to finish...\n");
2007-05-02 21:27:17 +04:00
	send_status = safe_apic_wait_icr_idle();
2005-04-17 02:20:36 +04:00
	atomic_set(&init_deasserted, 1);
	/*
	 * Should we send STARTUP IPIs ?
	 *
	 * Determine this based on the APIC version.
	 * If we don't have an integrated APIC, don't send the STARTUP IPIs.
	 */
	if (APIC_INTEGRATED(apic_version[phys_apicid]))
		num_starts = 2;
	else
		num_starts = 0;
2007-02-13 15:26:21 +03:00
	/*
	 * Paravirt / VMI wants a startup IPI hook here to set up the
	 * target processor state.
	 */
	startup_ipi_hook(phys_apicid, (unsigned long) start_secondary,
			 (unsigned long) stack_start.esp);
2005-04-17 02:20:36 +04:00
	/*
	 * Run STARTUP IPI loop.
	 */
	Dprintk("#startup loops: %d.\n", num_starts);
2007-02-16 12:27:58 +03:00
	maxlvt = lapic_get_maxlvt();
2005-04-17 02:20:36 +04:00
	for (j = 1; j <= num_starts; j++) {
		Dprintk("Sending STARTUP #%d.\n", j);
		apic_read_around(APIC_SPIV);
		apic_write(APIC_ESR, 0);
		apic_read(APIC_ESR);
		Dprintk("After apic_write.\n");
		/*
		 * STARTUP IPI
		 */
		/* Target chip */
		apic_write_around(APIC_ICR2, SET_APIC_DEST_FIELD(phys_apicid));
		/* Boot on the stack */
		/* Kick the second */
		apic_write_around(APIC_ICR, APIC_DM_STARTUP
					| (start_eip >> 12));
		/*
		 * Give the other CPU some time to accept the IPI.
		 */
		udelay(300);
		Dprintk("Startup point 1.\n");
		Dprintk("Waiting for send to finish...\n");
2007-05-02 21:27:17 +04:00
		send_status = safe_apic_wait_icr_idle();
2005-04-17 02:20:36 +04:00
		/*
		 * Give the other CPU some time to accept the IPI.
		 */
		udelay(200);
		/*
		 * Due to the Pentium erratum 3AP.
		 */
		if (maxlvt > 3) {
			apic_read_around(APIC_SPIV);
			apic_write(APIC_ESR, 0);
		}
		accept_status = (apic_read(APIC_ESR) & 0xEF);
		if (send_status || accept_status)
			break;
	}
	Dprintk("After Startup.\n");
	if (send_status)
		printk("APIC never delivered???\n");
	if (accept_status)
		printk("APIC delivery error (%lx).\n", accept_status);
	return (send_status | accept_status);
}
#endif	/* WAKE_SECONDARY_VIA_INIT */
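The INIT assert, INIT deassert and STARTUP steps above each write a different command word to the APIC interrupt command register. The sketch below only composes those words in user space, so the differences are easy to see. The `SKETCH_*` constants are assumptions: they hold the values as I recall them from the kernel's APIC definitions header and should be checked against it, and the `icr_*()` helpers are hypothetical names that touch no hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit values; verify against the kernel's APIC definitions. */
#define SKETCH_APIC_DM_INIT		0x00500u
#define SKETCH_APIC_DM_STARTUP		0x00600u
#define SKETCH_APIC_INT_ASSERT		0x04000u
#define SKETCH_APIC_INT_LEVELTRIG	0x08000u

/* Destination field of the high ICR word (cf. SET_APIC_DEST_FIELD). */
uint32_t icr2_dest(uint8_t apicid)
{
	return (uint32_t)apicid << 24;
}

/* Step 1: level-triggered INIT with the assert bit set. */
uint32_t icr_init_assert(void)
{
	return SKETCH_APIC_INT_LEVELTRIG | SKETCH_APIC_INT_ASSERT |
	       SKETCH_APIC_DM_INIT;
}

/* Step 2: the same INIT command with the assert bit cleared. */
uint32_t icr_init_deassert(void)
{
	return SKETCH_APIC_INT_LEVELTRIG | SKETCH_APIC_DM_INIT;
}

/* Step 3: STARTUP, carrying the trampoline's physical page number. */
uint32_t icr_startup(uint32_t start_eip)
{
	/* The trampoline must be page-aligned and below 1 MiB. */
	assert((start_eip & 0xFFFu) == 0 && start_eip < 0x100000u);
	return SKETCH_APIC_DM_STARTUP | (start_eip >> 12);
}
```

With these assumed values, the two INIT commands differ only in the assert bit, and a trampoline at 0x9E000 produces a STARTUP command whose low byte is the page number 0x9E.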
extern cpumask_t cpu_initialized ;
2005-06-26 01:54:56 +04:00
static inline int alloc_cpu_id(void)
{
	cpumask_t tmp_map;
	int cpu;
	cpus_complement(tmp_map, cpu_present_map);
	cpu = first_cpu(tmp_map);
	if (cpu >= NR_CPUS)
		return -ENODEV;
	return cpu;
}
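`alloc_cpu_id()` picks the lowest CPU number that is not yet present by complementing the present map and taking its first set bit. A minimal user-space sketch of the same search on a plain bitmask follows; `first_free_cpu()` and `SKETCH_NR_CPUS` are hypothetical stand-ins for the cpumask helpers and `NR_CPUS`.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical limit standing in for NR_CPUS. */
#define SKETCH_NR_CPUS 8

/*
 * Return the lowest CPU number whose bit is clear in present_mask,
 * or -ENODEV when every slot below SKETCH_NR_CPUS is taken.
 */
int first_free_cpu(uint32_t present_mask)
{
	assert(SKETCH_NR_CPUS <= 32);	/* the mask is only 32 bits wide */

	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (!(present_mask & (1u << cpu)))
			return cpu;
	}
	return -ENODEV;
}
```

With CPUs 0 to 2 present (mask 0x7) the next ID handed out is 3, and a full mask yields `-ENODEV`, mirroring the `cpu >= NR_CPUS` check in the listing.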
#ifdef CONFIG_HOTPLUG_CPU
2007-12-20 01:20:18 +03:00
static struct task_struct * __cpuinitdata cpu_idle_tasks[NR_CPUS];
static inline struct task_struct * __cpuinit alloc_idle_task(int cpu)
2005-06-26 01:54:56 +04:00
{
	struct task_struct *idle;
	if ((idle = cpu_idle_tasks[cpu]) != NULL) {
		/* initialize thread_struct. we really want to avoid
		 * destroying the idle thread
		 */
2006-01-12 12:05:41 +03:00
		idle->thread.esp = (unsigned long)task_pt_regs(idle);
2005-06-26 01:54:56 +04:00
		init_idle(idle, cpu);
		return idle;
	}
	idle = fork_idle(cpu);
	if (!IS_ERR(idle))
		cpu_idle_tasks[cpu] = idle;
	return idle;
}
#else
#define alloc_idle_task(cpu) fork_idle(cpu)
#endif
2005-04-17 02:20:36 +04:00
2007-01-11 03:52:44 +03:00
static int __cpuinit do_boot_cpu(int apicid, int cpu)
2005-04-17 02:20:36 +04:00
/*
 * NOTE - on most systems this is a PHYSICAL apic ID, but on multiquad
 * (ie clustered apic addressing mode), this is a LOGICAL apic ID.
 * Returns zero if CPU booted OK, else error code from wakeup_secondary_cpu.
 */
{
	struct task_struct *idle;
	unsigned long boot_error;
2005-06-26 01:54:56 +04:00
	int timeout;
2005-04-17 02:20:36 +04:00
	unsigned long start_eip;
	unsigned short nmi_high = 0, nmi_low = 0;
[PATCH] x86: Save the MTRRs of the BSP before booting an AP
Applied fix by Andew Morton:
http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'.
AMD and Intel x86 CPU manuals state that it is the responsibility of
system software to initialize and maintain MTRR consistency across
all processors in Multi-Processing Environments.
Quote from page 188 of the AMD64 System Programming manual (Volume 2):
7.6.5 MTRRs in Multi-Processing Environments
"In multi-processing environments, the MTRRs located in all processors must
characterize memory in the same way. Generally, this means that identical
values are written to the MTRRs used by the processors." (short omission here)
"Failure to do so may result in coherency violations or loss of atomicity.
Processor implementations do not check the MTRR settings in other processors
to ensure consistency. It is the responsibility of system software to
initialize and maintain MTRR consistency across all processors."
Current Linux MTRR code already implements the above in the case that the
BIOS does not properly initialize MTRRs on the secondary processors,
but the case where the fixed-range MTRRs of the boot processor are changed
after Linux started to boot, before the initialsation of a secondary
processor, is not handled yet.
In this case, secondary processors are currently initialized by Linux
with MTRRs which the boot processor had very early, when mtrr_bp_init()
did run, but not with the MTRRs which the boot processor uses at the
time when that secondary processors is actually booted,
causing differing MTRR contents on the secondary processors.
Such situation happens on Acer Ferrari 1000 and 5000 notebooks where the
BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs
of the boot processor when it transitions the system into ACPI mode.
The SMI handler of the BIOS does this in SMM, entered while Linux ACPI
code runs acpi_enable().
Other occasions where the SMI handler of the BIOS may change bits in
the MTRRs could occur as well. To initialize newly booted secodary
processors with the fixed-range MTRRs which the boot processor uses
at that time, this patch saves the fixed-range MTRRs of the boot
processor before new secondary processors are started. When the
secondary processors run their Linux initialisation code, their
fixed-range MTRRs will be updated with the saved fixed-range MTRRs.
If CONFIG_MTRR is not set, we define mtrr_save_state
as an empty statement because there is nothing to do.
Possible TODOs:
*) CPU-hotplugging outside of SMP suspend/resume is not yet tested
with this patch.
*) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up,
then the calls to mtrr_save_state() could be replaced by calls to
mtrr_save_fixed_ranges(NULL) and mtrr_save_state() would not be
needed.
That would need either verification of the CPU-hotplug code or
at least a test on a >2 CPU machine.
*) The MTRRs of other running processors are not yet checked at this
time but it might be interesting to syncronize the MTTRs of all
processors before booting. That would be an incremental patch,
but of rather low priority since there is no machine known so
far which would require this.
AK: moved prototypes on x86-64 around to fix warnings
Signed-off-by: Bernhard Kaindl <bk@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Cc: Andi Kleen <ak@suse.de>
Cc: Dave Jones <davej@codemonkey.org.uk>
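The save-then-apply idea described in this commit message can be modeled in plain user-space C. This is only a sketch under stated assumptions: `struct cpu_model`, `save_fixed_ranges()` and `apply_saved_fixed_ranges()` are hypothetical names invented for illustration and are not the kernel's MTRR code. Real fixed-range MTRRs are MSRs; the sketch replaces them with an array of 11 values, which is the number of fixed-range MTRR registers on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 defines 11 fixed-range MTRR registers (1 x 64K, 2 x 16K, 8 x 4K). */
#define NUM_FIXED_RANGES 11

/* Hypothetical stand-in for one processor's fixed-range MTRR contents. */
struct cpu_model {
	uint64_t fixed_mtrr[NUM_FIXED_RANGES];
};

/* Snapshot taken from the boot processor just before a secondary is started. */
static uint64_t saved_fixed[NUM_FIXED_RANGES];

/* Record the boot processor's current fixed-range values. */
void save_fixed_ranges(const struct cpu_model *bsp)
{
	assert(bsp != NULL);
	memcpy(saved_fixed, bsp->fixed_mtrr, sizeof(saved_fixed));
}

/*
 * Bring a secondary processor in line with the snapshot.
 * Returns the number of registers that had to be rewritten.
 */
int apply_saved_fixed_ranges(struct cpu_model *ap)
{
	int changed = 0;

	assert(ap != NULL);
	for (int i = 0; i < NUM_FIXED_RANGES; i++) {
		if (ap->fixed_mtrr[i] != saved_fixed[i]) {
			ap->fixed_mtrr[i] = saved_fixed[i];
			changed++;
		}
	}
	return changed;
}
```

In this model, a change made to the boot processor after early boot (for example by an SMI handler) is picked up as long as `save_fixed_ranges()` runs immediately before the secondary is started, which mirrors where the patch places its `mtrr_save_state()` call.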
2007-05-02 21:27:17 +04:00
/*
* Save current MTRR state in case it was changed since early boot
* ( e . g . by the ACPI SMI ) to initialize new CPUs with MTRRs in sync :
*/
mtrr_save_state ( ) ;
2005-04-17 02:20:36 +04:00
/*
* We can ' t use kernel_thread since we must avoid
* rescheduling the child .
*/
2005-06-26 01:54:56 +04:00
idle = alloc_idle_task ( cpu ) ;
2005-04-17 02:20:36 +04:00
if ( IS_ERR ( idle ) )
panic ( " failed fork for CPU %d " , cpu ) ;
2006-12-07 04:14:02 +03:00
2007-05-02 21:27:16 +04:00
init_gdt ( cpu ) ;
per_cpu ( current_task , cpu ) = idle ;
2007-05-02 21:27:10 +04:00
early_gdt_descr . address = ( unsigned long ) get_cpu_gdt_table ( cpu ) ;
2006-12-07 04:14:02 +03:00
2005-04-17 02:20:36 +04:00
idle - > thread . eip = ( unsigned long ) start_secondary ;
/* start_eip had better be page-aligned! */
start_eip = setup_trampoline ( ) ;
2006-12-07 04:14:02 +03:00
+ + cpucount ;
alternatives_smp_switch ( 1 ) ;
2005-04-17 02:20:36 +04:00
/* So we see what's up */
printk ( " Booting processor %d/%d eip %lx \n " , cpu , apicid , start_eip ) ;
/* Stack for startup_32 can be just as for start_secondary onwards */
stack_start . esp = ( void * ) idle - > thread . esp ;
irq_ctx_init ( cpu ) ;
2007-10-19 22:35:03 +04:00
per_cpu ( x86_cpu_to_apicid , cpu ) = apicid ;
2005-04-17 02:20:36 +04:00
/*
* This grunge runs the startup process for
* the targeted processor .
*/
atomic_set ( & init_deasserted , 0 ) ;
Dprintk ( " Setting warm reset code and vector. \n " ) ;
store_NMI_vector ( & nmi_high , & nmi_low ) ;
smpboot_setup_warm_reset_vector ( start_eip ) ;
/*
* Starting actual IPI sequence . . .
*/
boot_error = wakeup_secondary_cpu ( apicid , start_eip ) ;
if ( ! boot_error ) {
/*
* allow APs to start initializing .
*/
Dprintk ( " Before Callout %d. \n " , cpu ) ;
cpu_set ( cpu , cpu_callout_map ) ;
Dprintk ( " After Callout %d. \n " , cpu ) ;
/*
* Wait 5 s total for a response
*/
for ( timeout = 0 ; timeout < 50000 ; timeout + + ) {
if ( cpu_isset ( cpu , cpu_callin_map ) )
break ; /* It has booted */
udelay ( 100 ) ;
}
if ( cpu_isset ( cpu , cpu_callin_map ) ) {
/* number CPUs logically, starting from 1 (BSP is 0) */
Dprintk ( " OK. \n " ) ;
printk ( " CPU%d: " , cpu ) ;
2007-10-19 22:35:04 +04:00
print_cpu_info ( & cpu_data ( cpu ) ) ;
2005-04-17 02:20:36 +04:00
Dprintk ( " CPU has booted. \n " ) ;
} else {
boot_error = 1 ;
if ( * ( ( volatile unsigned char * ) trampoline_base )
= = 0xA5 )
/* trampoline started but...? */
printk ( " Stuck ?? \n " ) ;
else
/* trampoline code not run */
printk ( " Not responding. \n " ) ;
inquire_remote_apic ( apicid ) ;
}
}
2005-06-26 01:54:56 +04:00
2005-04-17 02:20:36 +04:00
if ( boot_error ) {
/* Try to put things back the way they were before ... */
unmap_cpu_to_logical_apicid ( cpu ) ;
cpu_clear ( cpu , cpu_callout_map ) ; /* was set here (do_boot_cpu()) */
cpu_clear ( cpu , cpu_initialized ) ; /* was set by cpu_init() */
cpucount - - ;
2005-06-26 01:54:56 +04:00
} else {
2007-10-19 22:35:03 +04:00
per_cpu ( x86_cpu_to_apicid , cpu ) = apicid ;
2005-06-26 01:54:56 +04:00
cpu_set ( cpu , cpu_present_map ) ;
2005-04-17 02:20:36 +04:00
}
/* mark "stuck" area as not stuck */
* ( ( volatile unsigned long * ) trampoline_base ) = 0 ;
return boot_error ;
}
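The "Wait 5 s total" loop in do_boot_cpu() is a bounded poll: 50000 iterations with a 100 microsecond delay each. The following user-space sketch reproduces only that control flow. `struct fake_cpu`, `fake_called_in()`, `fake_udelay()` and `wait_for_callin()` are hypothetical names for illustration; the fake delay advances a counter instead of actually sleeping, and nothing here touches cpu_callin_map.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POLL_ITERATIONS 50000L	/* same bound as the loop in do_boot_cpu() */
#define POLL_DELAY_US   100L	/* same per-iteration delay as udelay(100) */

/* Hypothetical model of a secondary CPU that calls in after some time. */
struct fake_cpu {
	long ready_after_us;	/* simulated time at which it calls in */
	long elapsed_us;	/* simulated time spent waiting so far */
};

/* Stand-in for the cpu_isset(cpu, cpu_callin_map) check. */
bool fake_called_in(const struct fake_cpu *c)
{
	return c->elapsed_us >= c->ready_after_us;
}

/* Stand-in for udelay(): only advances the simulated clock. */
void fake_udelay(struct fake_cpu *c, long us)
{
	c->elapsed_us += us;
}

/* Upper bound on the wait, in microseconds. */
long max_wait_us(void)
{
	return POLL_ITERATIONS * POLL_DELAY_US;
}

/* Poll until the CPU calls in or the bound is reached; true on success. */
bool wait_for_callin(struct fake_cpu *c)
{
	assert(c != NULL);
	for (long t = 0; t < POLL_ITERATIONS; t++) {
		if (fake_called_in(c))
			break;
		fake_udelay(c, POLL_DELAY_US);
	}
	/* Re-check after the loop, as the original code does. */
	return fake_called_in(c);
}
```

The final re-check matters because the loop can end either by the break or by exhausting its iterations; only the flag itself tells the two cases apart.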
2005-06-26 01:54:56 +04:00
# ifdef CONFIG_HOTPLUG_CPU
void cpu_exit_clear ( void )
{
int cpu = raw_smp_processor_id ( ) ;
idle_task_exit ( ) ;
cpucount - - ;
cpu_uninit ( ) ;
irq_ctx_exit ( cpu ) ;
cpu_clear ( cpu , cpu_callout_map ) ;
cpu_clear ( cpu , cpu_callin_map ) ;
cpu_clear ( cpu , smp_commenced_mask ) ;
unmap_cpu_to_logical_apicid ( cpu ) ;
}
struct warm_boot_cpu_info {
struct completion * complete ;
2006-11-22 17:57:56 +03:00
struct work_struct task ;
2005-06-26 01:54:56 +04:00
int apicid ;
int cpu ;
} ;
2006-11-22 17:57:56 +03:00
static void __cpuinit do_warm_boot_cpu ( struct work_struct * work )
2005-06-26 01:54:56 +04:00
{
2006-11-22 17:57:56 +03:00
struct warm_boot_cpu_info * info =
container_of ( work , struct warm_boot_cpu_info , task ) ;
2005-06-26 01:54:56 +04:00
do_boot_cpu ( info - > apicid , info - > cpu ) ;
complete ( info - > complete ) ;
}
2006-03-25 14:08:18 +03:00
static int __cpuinit __smp_prepare_cpu ( int cpu )
2005-06-26 01:54:56 +04:00
{
2006-10-01 10:28:10 +04:00
DECLARE_COMPLETION_ONSTACK ( done ) ;
2005-06-26 01:54:56 +04:00
struct warm_boot_cpu_info info ;
int apicid , ret ;
2007-10-19 22:35:03 +04:00
apicid = per_cpu ( x86_cpu_to_apicid , cpu ) ;
2005-06-26 01:54:56 +04:00
if ( apicid = = BAD_APICID ) {
ret = - ENODEV ;
goto exit ;
}
info . complete = & done ;
info . apicid = apicid ;
info . cpu = cpu ;
2006-11-22 17:57:56 +03:00
INIT_WORK ( & info . task , do_warm_boot_cpu ) ;
2005-06-26 01:54:56 +04:00
/* init low mem mapping */
2005-09-04 02:56:50 +04:00
clone_pgd_range ( swapper_pg_dir , swapper_pg_dir + USER_PGD_PTRS ,
2006-12-08 13:41:13 +03:00
min_t ( unsigned long , KERNEL_PGD_PTRS , USER_PGD_PTRS ) ) ;
2005-06-26 01:54:56 +04:00
flush_tlb_all ( ) ;
2006-11-22 17:57:56 +03:00
schedule_work ( & info . task ) ;
2005-06-26 01:54:56 +04:00
wait_for_completion ( & done ) ;
zap_low_mappings ( ) ;
ret = 0 ;
exit :
return ret ;
}
# endif
2005-04-17 02:20:36 +04:00
/*
* Cycle through the processors sending APIC IPIs to boot each .
*/
static int boot_cpu_logical_apicid ;
/* Where the IO area was mapped on multiquad, always 0 otherwise */
void * xquad_portio ;
2005-06-23 11:08:33 +04:00
# ifdef CONFIG_X86_NUMAQ
EXPORT_SYMBOL ( xquad_portio ) ;
# endif
2005-04-17 02:20:36 +04:00
static void __init smp_boot_cpus ( unsigned int max_cpus )
{
int apicid , cpu , bit , kicked ;
unsigned long bogosum = 0 ;
/*
* Setup boot CPU information
*/
smp_store_cpu_info ( 0 ) ; /* Final full version of the data */
printk ( " CPU%d: " , 0 ) ;
2007-10-19 22:35:04 +04:00
print_cpu_info ( & cpu_data ( 0 ) ) ;
2005-04-17 02:20:36 +04:00
2005-11-01 06:16:17 +03:00
boot_cpu_physical_apicid = GET_APIC_ID ( apic_read ( APIC_ID ) ) ;
2005-04-17 02:20:36 +04:00
boot_cpu_logical_apicid = logical_smp_processor_id ( ) ;
2007-10-19 22:35:03 +04:00
per_cpu ( x86_cpu_to_apicid , 0 ) = boot_cpu_physical_apicid ;
2005-04-17 02:20:36 +04:00
current_thread_info ( ) - > cpu = 0 ;
2005-11-05 19:25:54 +03:00
set_cpu_sibling_map ( 0 ) ;
2005-04-17 02:25:15 +04:00
2005-04-17 02:20:36 +04:00
/*
* If we couldn ' t find an SMP configuration at boot time ,
* get out of here now !
*/
if ( ! smp_found_config & & ! acpi_lapic ) {
printk ( KERN_NOTICE " SMP motherboard not detected. \n " ) ;
2005-11-01 06:16:17 +03:00
smpboot_clear_io_apic_irqs ( ) ;
phys_cpu_present_map = physid_mask_of_physid ( 0 ) ;
if ( APIC_init_uniprocessor ( ) )
printk ( KERN_NOTICE " Local APIC not detected. "
" Using dummy APIC emulation. \n " ) ;
map_cpu_to_logical_apicid ( ) ;
2007-10-16 12:24:05 +04:00
cpu_set ( 0 , per_cpu ( cpu_sibling_map , 0 ) ) ;
2007-10-16 12:24:04 +04:00
cpu_set ( 0 , per_cpu ( cpu_core_map , 0 ) ) ;
2005-11-01 06:16:17 +03:00
return ;
}
/*
* Should not be necessary because the MP table should list the boot
* CPU too , but we do it for the sake of robustness anyway .
* Makes no sense to do this check in clustered apic mode , so skip it
*/
if ( ! check_phys_apicid_present ( boot_cpu_physical_apicid ) ) {
printk ( " weird, boot CPU (#%d) not listed by the BIOS. \n " ,
boot_cpu_physical_apicid ) ;
physid_set ( hard_smp_processor_id ( ) , phys_cpu_present_map ) ;
}
/*
* If we couldn ' t find a local APIC , then get out of here now !
*/
if ( APIC_INTEGRATED ( apic_version [ boot_cpu_physical_apicid ] ) & & ! cpu_has_apic ) {
printk ( KERN_ERR " BIOS bug, local APIC #%d not detected!... \n " ,
boot_cpu_physical_apicid ) ;
printk ( KERN_ERR " ... forcing use of dummy APIC emulation. (tell your hw vendor) \n " ) ;
smpboot_clear_io_apic_irqs ( ) ;
phys_cpu_present_map = physid_mask_of_physid ( 0 ) ;
2007-10-19 22:35:02 +04:00
map_cpu_to_logical_apicid ( ) ;
2007-10-16 12:24:05 +04:00
cpu_set ( 0 , per_cpu ( cpu_sibling_map , 0 ) ) ;
2007-10-16 12:24:04 +04:00
cpu_set ( 0 , per_cpu ( cpu_core_map , 0 ) ) ;
2005-04-17 02:20:36 +04:00
return ;
}
2005-11-01 06:16:17 +03:00
verify_local_APIC ( ) ;
2005-04-17 02:20:36 +04:00
/*
* If SMP should be disabled , then really disable it !
*/
2005-11-01 06:16:17 +03:00
if ( ! max_cpus ) {
smp_found_config = 0 ;
printk ( KERN_INFO " SMP mode deactivated, forcing use of dummy APIC emulation. \n " ) ;
2007-10-17 20:04:34 +04:00
if ( nmi_watchdog = = NMI_LOCAL_APIC ) {
printk ( KERN_INFO " activating minimal APIC for NMI watchdog use. \n " ) ;
connect_bsp_APIC ( ) ;
setup_local_APIC ( ) ;
}
2005-11-01 06:16:17 +03:00
smpboot_clear_io_apic_irqs ( ) ;
phys_cpu_present_map = physid_mask_of_physid ( 0 ) ;
2007-10-19 22:35:02 +04:00
map_cpu_to_logical_apicid ( ) ;
2007-10-16 12:24:05 +04:00
cpu_set ( 0 , per_cpu ( cpu_sibling_map , 0 ) ) ;
2007-10-16 12:24:04 +04:00
cpu_set ( 0 , per_cpu ( cpu_core_map , 0 ) ) ;
2005-04-17 02:20:36 +04:00
return ;
}
2005-11-01 06:16:17 +03:00
connect_bsp_APIC ( ) ;
setup_local_APIC ( ) ;
map_cpu_to_logical_apicid ( ) ;
2005-04-17 02:20:36 +04:00
setup_portio_remap ( ) ;
/*
* Scan the CPU present map and fire up the other CPUs via do_boot_cpu
*
* In clustered apic mode , phys_cpu_present_map is constructed thus :
* bits 0 - 3 are quad0 , 4 - 7 are quad1 , etc . A perverse twist on the
* clustered apic ID .
*/
Dprintk ( " CPU present map: %lx \n " , physids_coerce ( phys_cpu_present_map ) ) ;
kicked = 1 ;
for ( bit = 0 ; kicked < NR_CPUS & & bit < MAX_APICS ; bit + + ) {
apicid = cpu_present_to_apicid ( bit ) ;
/*
* Don ' t even attempt to start the boot CPU !
*/
if ( ( apicid = = boot_cpu_apicid ) | | ( apicid = = BAD_APICID ) )
continue ;
if ( ! check_apicid_present ( bit ) )
continue ;
if ( max_cpus < = cpucount + 1 )
continue ;
2005-06-26 01:54:56 +04:00
if ( ( ( cpu = alloc_cpu_id ( ) ) < = 0 ) | | do_boot_cpu ( apicid , cpu ) )
2005-04-17 02:20:36 +04:00
printk ( " CPU #%d not responding - cannot use it. \n " ,
apicid ) ;
else
+ + kicked ;
}
/*
* Cleanup possible dangling ends . . .
*/
smpboot_restore_warm_reset_vector ( ) ;
/*
* Allow the user to impress friends .
*/
Dprintk ( " Before bogomips. \n " ) ;
for ( cpu = 0 ; cpu < NR_CPUS ; cpu + + )
if ( cpu_isset ( cpu , cpu_callout_map ) )
2007-10-19 22:35:04 +04:00
bogosum + = cpu_data ( cpu ) . loops_per_jiffy ;
2005-04-17 02:20:36 +04:00
printk ( KERN_INFO
" Total of %d processors activated (%lu.%02lu BogoMIPS). \n " ,
cpucount + 1 ,
bogosum / ( 500000 / HZ ) ,
( bogosum / ( 5000 / HZ ) ) % 100 ) ;
Dprintk ( " Before bogocount - setting activated=1. \n " ) ;
if ( smp_b_stepping )
printk ( KERN_WARNING " WARNING: SMP operation may be unreliable with B stepping processors. \n " ) ;
/*
* Don ' t taint if we are running SMP kernel on a single non - MP
* approved Athlon
*/
if ( tainted & TAINT_UNSAFE_SMP ) {
if ( cpucount )
printk ( KERN_INFO " WARNING: This combination of AMD processors is not suitable for SMP. \n " ) ;
else
tainted & = ~ TAINT_UNSAFE_SMP ;
}
Dprintk ( " Boot done. \n " ) ;
/*
2007-10-16 12:24:05 +04:00
* construct cpu_sibling_map , so that we can tell sibling CPUs
2005-04-17 02:20:36 +04:00
* efficiently .
*/
2005-04-17 02:25:15 +04:00
for ( cpu = 0 ; cpu < NR_CPUS ; cpu + + ) {
2007-10-16 12:24:05 +04:00
cpus_clear ( per_cpu ( cpu_sibling_map , cpu ) ) ;
2007-10-16 12:24:04 +04:00
cpus_clear ( per_cpu ( cpu_core_map , cpu ) ) ;
2005-04-17 02:25:15 +04:00
}
2005-04-17 02:20:36 +04:00
2007-10-16 12:24:05 +04:00
cpu_set ( 0 , per_cpu ( cpu_sibling_map , 0 ) ) ;
2007-10-16 12:24:04 +04:00
cpu_set ( 0 , per_cpu ( cpu_core_map , 0 ) ) ;
2005-04-17 02:20:36 +04:00
2005-11-01 06:16:17 +03:00
smpboot_setup_io_apic ( ) ;
2007-02-13 15:26:21 +03:00
setup_boot_clock ( ) ;
2005-04-17 02:20:36 +04:00
}
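The "Total of %d processors activated" message in smp_boot_cpus() turns the summed loops_per_jiffy values into a BogoMIPS figure with two decimal places using integer arithmetic only. Below is a small user-space sketch of that arithmetic. `struct bogomips`, `bogomips_from_lpj()` and `format_bogomips()` are hypothetical helper names, and HZ is passed as a parameter here rather than being a compile-time constant.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Whole and fractional (hundredths) parts of a BogoMIPS value. */
struct bogomips {
	unsigned long whole;
	unsigned long frac;
};

/*
 * Same integer arithmetic as the printk() arguments:
 *   whole = sum / (500000 / HZ)
 *   frac  = (sum / (5000 / HZ)) % 100
 * hz must not exceed 5000, otherwise 5000 / hz would be zero.
 */
struct bogomips bogomips_from_lpj(unsigned long lpj_sum, unsigned long hz)
{
	struct bogomips b;

	assert(hz > 0 && hz <= 5000);
	b.whole = lpj_sum / (500000UL / hz);
	b.frac = (lpj_sum / (5000UL / hz)) % 100UL;
	return b;
}

/* Format as "whole.frac" with two digits, matching "%lu.%02lu". */
int format_bogomips(char *buf, size_t len, unsigned long lpj_sum,
		    unsigned long hz)
{
	struct bogomips b = bogomips_from_lpj(lpj_sum, hz);

	return snprintf(buf, len, "%lu.%02lu", b.whole, b.frac);
}
```

For example, with hz = 250 and a summed loops_per_jiffy of 3990123, the divisors are 2000 and 20, giving a whole part of 1995 and a fractional part of 6, which formats as 1995.06.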
/* These are wrappers to interface to the new boot process. Someone
who understands all this stuff should rewrite it properly . - - RR 15 / Jul / 02 */
2007-05-02 21:27:11 +04:00
void __init native_smp_prepare_cpus ( unsigned int max_cpus )
2005-04-17 02:20:36 +04:00
{
2005-06-26 01:54:50 +04:00
smp_commenced_mask = cpumask_of_cpu ( 0 ) ;
cpu_callin_map = cpumask_of_cpu ( 0 ) ;
mb ( ) ;
2005-04-17 02:20:36 +04:00
smp_boot_cpus ( max_cpus ) ;
}
2007-05-02 21:27:11 +04:00
void __init native_smp_prepare_boot_cpu ( void )
2007-05-02 21:27:10 +04:00
{
unsigned int cpu = smp_processor_id ( ) ;
2007-05-02 21:27:16 +04:00
init_gdt ( cpu ) ;
2007-05-02 21:27:10 +04:00
switch_to_new_gdt ( ) ;
cpu_set ( cpu , cpu_online_map ) ;
cpu_set ( cpu , cpu_callout_map ) ;
cpu_set ( cpu , cpu_present_map ) ;
cpu_set ( cpu , cpu_possible_map ) ;
__get_cpu_var ( cpu_state ) = CPU_ONLINE ;
2005-04-17 02:20:36 +04:00
}
2005-06-26 01:54:50 +04:00
# ifdef CONFIG_HOTPLUG_CPU
2007-07-18 05:37:03 +04:00
void remove_siblinginfo ( int cpu )
2005-04-17 02:20:36 +04:00
{
2005-06-26 01:54:56 +04:00
int sibling ;
2007-10-19 22:35:04 +04:00
struct cpuinfo_x86 * c = & cpu_data ( cpu ) ;
2005-06-26 01:54:56 +04:00
2007-10-16 12:24:04 +04:00
for_each_cpu_mask ( sibling , per_cpu ( cpu_core_map , cpu ) ) {
cpu_clear ( cpu , per_cpu ( cpu_core_map , sibling ) ) ;
/*
2005-11-05 19:25:54 +03:00
* last thread sibling in this cpu core going down
*/
2007-10-16 12:24:05 +04:00
if ( cpus_weight ( per_cpu ( cpu_sibling_map , cpu ) ) = = 1 )
2007-10-19 22:35:04 +04:00
cpu_data ( sibling ) . booted_cores - - ;
2005-11-05 19:25:54 +03:00
}
2007-10-16 12:24:05 +04:00
for_each_cpu_mask ( sibling , per_cpu ( cpu_sibling_map , cpu ) )
cpu_clear ( cpu , per_cpu ( cpu_sibling_map , sibling ) ) ;
cpus_clear ( per_cpu ( cpu_sibling_map , cpu ) ) ;
2007-10-16 12:24:04 +04:00
cpus_clear ( per_cpu ( cpu_core_map , cpu ) ) ;
2007-10-19 22:35:04 +04:00
c - > phys_proc_id = 0 ;
c - > cpu_core_id = 0 ;
2005-11-05 19:25:54 +03:00
cpu_clear ( cpu , cpu_sibling_setup_map ) ;
2005-06-26 01:54:50 +04:00
}
int __cpu_disable ( void )
{
cpumask_t map = cpu_online_map ;
int cpu = smp_processor_id ( ) ;
/*
* Perhaps use cpufreq to drop frequency , but that could go
* into generic code .
*
* We won ' t take down the boot processor on i386 due to some
* interrupts only being able to be serviced by the BSP .
* Especially so if we ' re not using an IOAPIC - zwane
*/
if ( cpu = = 0 )
return - EBUSY ;
2006-09-26 12:52:27 +04:00
if ( nmi_watchdog = = NMI_LOCAL_APIC )
stop_apic_nmi_watchdog ( NULL ) ;
2005-12-13 09:17:08 +03:00
clear_local_APIC ( ) ;
2005-06-26 01:54:50 +04:00
/* Allow any queued timer interrupts to get serviced */
local_irq_enable ( ) ;
mdelay ( 1 ) ;
local_irq_disable ( ) ;
2005-06-26 01:54:56 +04:00
remove_siblinginfo ( cpu ) ;
2005-06-26 01:54:50 +04:00
cpu_clear ( cpu , map ) ;
fixup_irqs ( map ) ;
/* It's now safe to remove this processor from the online map */
cpu_clear ( cpu , cpu_online_map ) ;
return 0 ;
}
void __cpu_die ( unsigned int cpu )
{
/* We don't do anything here: idle task is faking death itself. */
unsigned int i ;
for ( i = 0 ; i < 10 ; i + + ) {
/* They ack this in play_dead by setting CPU_DEAD */
2005-06-26 01:54:56 +04:00
if ( per_cpu ( cpu_state , cpu ) = = CPU_DEAD ) {
printk ( " CPU %d is now offline \n " , cpu ) ;
2006-03-23 13:59:32 +03:00
if ( 1 = = num_online_cpus ( ) )
alternatives_smp_switch ( 0 ) ;
2005-06-26 01:54:50 +04:00
return ;
2005-06-26 01:54:56 +04:00
}
2005-09-10 11:26:50 +04:00
msleep ( 100 ) ;
2005-04-17 02:20:36 +04:00
}
2005-06-26 01:54:50 +04:00
printk ( KERN_ERR " CPU %u didn't die... \n " , cpu ) ;
}
# else /* ... !CONFIG_HOTPLUG_CPU */
int __cpu_disable ( void )
{
return - ENOSYS ;
}
2005-04-17 02:20:36 +04:00
2005-06-26 01:54:50 +04:00
void __cpu_die ( unsigned int cpu )
{
/* We said "no" in __cpu_disable */
BUG ( ) ;
}
# endif /* CONFIG_HOTPLUG_CPU */
2007-05-02 21:27:11 +04:00
int __cpuinit native_cpu_up ( unsigned int cpu )
2005-06-26 01:54:50 +04:00
{
2007-03-07 20:12:31 +03:00
unsigned long flags ;
2006-03-25 14:08:18 +03:00
# ifdef CONFIG_HOTPLUG_CPU
2007-03-07 20:12:31 +03:00
int ret = 0 ;
2006-03-25 14:08:18 +03:00
/*
* We do warm boot only on cpus that had booted earlier .
* Otherwise cold boot is all handled from smp_boot_cpus ( ) .
* cpu_callin_map is set during AP kickstart process . It is reset
* when a cpu is taken offline from cpu_exit_clear ( ) .
*/
if ( ! cpu_isset ( cpu , cpu_callin_map ) )
ret = __smp_prepare_cpu ( cpu ) ;
if ( ret )
return - EIO ;
# endif
2005-04-17 02:20:36 +04:00
/* In case one didn't come up */
if ( ! cpu_isset ( cpu , cpu_callin_map ) ) {
2005-06-26 01:54:50 +04:00
printk ( KERN_DEBUG " skipping cpu%d, didn't come online \n " , cpu ) ;
2005-04-17 02:20:36 +04:00
return - EIO ;
}
2005-06-26 01:54:56 +04:00
per_cpu ( cpu_state , cpu ) = CPU_UP_PREPARE ;
2005-04-17 02:20:36 +04:00
/* Unleash the CPU! */
cpu_set ( cpu , smp_commenced_mask ) ;
2007-02-16 12:27:34 +03:00
/*
2007-03-07 20:12:31 +03:00
* Check TSC synchronization with the AP ( keep irqs disabled
* while doing so ) :
2007-02-16 12:27:34 +03:00
*/
2007-03-07 20:12:31 +03:00
local_irq_save ( flags ) ;
2007-02-16 12:27:34 +03:00
check_tsc_sync_source ( cpu ) ;
2007-03-07 20:12:31 +03:00
local_irq_restore ( flags ) ;
2007-02-16 12:27:34 +03:00
2007-03-07 20:12:31 +03:00
while ( ! cpu_isset ( cpu , cpu_online_map ) ) {
2006-06-25 16:46:52 +04:00
cpu_relax ( ) ;
2007-03-07 20:12:31 +03:00
touch_nmi_watchdog ( ) ;
}
2006-12-07 04:14:10 +03:00
2005-04-17 02:20:36 +04:00
return 0 ;
}
2007-05-02 21:27:11 +04:00
void __init native_smp_cpus_done ( unsigned int max_cpus )
2005-04-17 02:20:36 +04:00
{
# ifdef CONFIG_X86_IO_APIC
setup_ioapic_dest ( ) ;
# endif
zap_low_mappings ( ) ;
2005-06-26 01:54:56 +04:00
# ifndef CONFIG_HOTPLUG_CPU
2005-04-17 02:20:36 +04:00
/*
* Disable executability of the SMP trampoline :
*/
set_kernel_exec ( ( unsigned long ) trampoline_base , trampoline_exec ) ;
2005-06-26 01:54:56 +04:00
# endif
2005-04-17 02:20:36 +04:00
}
void __init smp_intr_init ( void )
{
/*
* IRQ0 must be given a fixed assignment and initialized ,
* because it ' s used before the IO - APIC is set up .
*/
set_intr_gate ( FIRST_DEVICE_VECTOR , interrupt [ 0 ] ) ;
/*
* The reschedule interrupt is a CPU - to - CPU reschedule - helper
* IPI , driven by wakeup .
*/
set_intr_gate ( RESCHEDULE_VECTOR , reschedule_interrupt ) ;
/* IPI for invalidation */
set_intr_gate ( INVALIDATE_TLB_VECTOR , invalidate_interrupt ) ;
/* IPI for generic function call */
set_intr_gate ( CALL_FUNCTION_VECTOR , call_function_interrupt ) ;
}
2006-09-26 12:52:32 +04:00
/*
* If the BIOS enumerates physical processors before logical ,
* maxcpus = N at enumeration - time can be used to disable HT .
*/
static int __init parse_maxcpus ( char * arg )
{
extern unsigned int maxcpus ;
maxcpus = simple_strtoul ( arg , NULL , 0 ) ;
return 0 ;
}
early_param ( " maxcpus " , parse_maxcpus ) ;