2019-05-27 08:55:01 +02:00
// SPDX-License-Identifier: GPL-2.0-or-later
2005-09-26 16:04:21 +10:00
/*
 * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org)
2010-04-08 00:38:22 -05:00
 * Copyright 2007-2010 Freescale Semiconductor, Inc.
2005-09-26 16:04:21 +10:00
*
 * Modified by Cort Dougan (cort@cs.nmt.edu)
 * and Paul Mackerras (paulus@samba.org)
*/
/*
 * This file handles the architecture-dependent parts of hardware exceptions
*/
# include <linux/errno.h>
# include <linux/sched.h>
2017-02-08 18:51:35 +01:00
# include <linux/sched/debug.h>
2005-09-26 16:04:21 +10:00
# include <linux/kernel.h>
# include <linux/mm.h>
2018-01-18 17:50:42 -08:00
# include <linux/pkeys.h>
2005-09-26 16:04:21 +10:00
# include <linux/stddef.h>
# include <linux/unistd.h>
2005-10-06 13:27:05 +10:00
# include <linux/ptrace.h>
2005-09-26 16:04:21 +10:00
# include <linux/user.h>
# include <linux/interrupt.h>
# include <linux/init.h>
2016-08-16 10:57:34 -04:00
# include <linux/extable.h>
# include <linux/module.h> /* print_modules */
2005-10-06 13:27:05 +10:00
# include <linux/prctl.h>
2005-09-26 16:04:21 +10:00
# include <linux/delay.h>
# include <linux/kprobes.h>
2005-12-04 18:39:43 +11:00
# include <linux/kexec.h>
2006-06-25 05:47:08 -07:00
# include <linux/backlight.h>
2006-12-08 03:30:41 -08:00
# include <linux/bug.h>
2007-05-08 00:27:03 -07:00
# include <linux/kdebug.h>
2011-06-04 05:36:54 +00:00
# include <linux/ratelimit.h>
powerpc: Exception hooks for context tracking subsystem
This adds the exception hooks for the context tracking subsystem, covering
data access, program check, single step, instruction breakpoint, machine check,
alignment, fp unavailable, altivec assist, and unknown exception, whose handlers
might use RCU.
This patch corresponds to
[PATCH] x86: Exception hooks for userspace RCU extended QS
commit 6ba3c97a38803883c2eee489505796cb0a727122
However, exception handling has since moved to generic code, with further
changes in the following two commits:
56dd9470d7c8734f055da2a6bac553caf4a468eb
context_tracking: Move exception handling to generic code
6c1e0256fad84a843d915414e4b5973b7443d48d
context_tracking: Restore correct previous context state on exception exit
so the exception hooks can use the generic code above instead of a
redundant arch implementation.
Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-05-13 16:16:41 +00:00
# include <linux/context_tracking.h>
2017-09-15 15:25:48 +10:00
# include <linux/smp.h>
powerpc/pseries, ps3: panic flush kernel messages before halting system
Platforms with a panic handler that halts the system can have problems
getting kernel messages out, because the panic notifiers are called
before kernel/panic.c does its flushing of printk buffers and console
etc.
This was attempted to be solved with commit a3b2cb30f252 ("powerpc: Do
not call ppc_md.panic in fadump panic notifier"), but that wasn't the
right approach and caused other problems, and was reverted by commit
ab9dbf771ff9.
Instead, the powernv shutdown paths have already had a similar
problem, fixed by taking the message flushing sequence from
kernel/panic.c. That's a little bit ugly, but while we have the code
duplicated, it will work for this case as well. So have ppc panic
handlers do the same flushing before they terminate.
Without this patch, a qemu pseries_le_defconfig guest stops silently
when issued the nmi command when xmon is off and no crash dumpers
enabled. Afterwards, an oops is printed by each CPU as expected.
Fixes: ab9dbf771ff9 ("Revert "powerpc: Do not call ppc_md.panic in fadump panic notifier"")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-12-24 02:49:23 +10:00
# include <linux/console.h>
# include <linux/kmsg_dump.h>
2021-08-12 18:58:31 +05:30
# include <linux/debugfs.h>
2005-09-26 16:04:21 +10:00
2009-05-18 02:10:05 +00:00
# include <asm/emulated_ops.h>
2016-12-24 11:46:01 -08:00
# include <linux/uaccess.h>
2021-01-30 23:08:38 +10:00
# include <asm/interrupt.h>
2005-09-26 16:04:21 +10:00
# include <asm/io.h>
2005-10-10 22:37:57 +10:00
# include <asm/machdep.h>
# include <asm/rtas.h>
2005-10-19 14:53:32 +10:00
# include <asm/pmc.h>
2005-09-26 16:04:21 +10:00
# include <asm/reg.h>
# ifdef CONFIG_PMAC_BACKLIGHT
# include <asm/backlight.h>
# endif
2005-10-01 18:43:42 +10:00
# ifdef CONFIG_PPC64
2005-10-10 22:37:57 +10:00
# include <asm/firmware.h>
2005-10-01 18:43:42 +10:00
# include <asm/processor.h>
# endif
2006-06-23 15:29:34 -07:00
# include <asm/kexec.h>
2009-02-10 20:10:44 +00:00
# include <asm/ppc-opcode.h>
2010-11-18 14:57:32 +08:00
# include <asm/rio.h>
2012-02-16 01:14:45 +00:00
# include <asm/fadump.h>
2012-03-28 18:30:02 +01:00
# include <asm/switch_to.h>
2013-02-13 16:21:39 +00:00
# include <asm/tm.h>
2012-03-28 18:30:02 +01:00
# include <asm/debug.h>
2016-05-18 11:16:50 +10:00
# include <asm/asm-prototypes.h>
KVM: PPC: Book3S HV: Fix TB corruption in guest exit path on HMI interrupt
When a guest is assigned to a core it converts the host Timebase (TB)
into guest TB by adding guest timebase offset before entering into
guest. During guest exit it restores the guest TB to host TB. This means
under certain conditions (Guest migration) host TB and guest TB can differ.
When we get an HMI for TB related issues the opal HMI handler would
try fixing errors and restore the correct host TB value. With no guest
running, we don't have any issues. But with guest running on the core
we run into TB corruption issues.
If we get an HMI while in the guest, the current HMI handler invokes opal
hmi handler before forcing guest to exit. The guest exit path subtracts
the guest TB offset from the current TB value which may have already
been restored with host value by opal hmi handler. This leads to incorrect
host and guest TB values.
With split-core, things become more complex. With split-core, TB also gets
split and each subcore gets its own TB register. When a hmi handler fixes
a TB error and restores the TB value, it affects all the TB values of
sibling subcores on the same core. On TB errors all the threads in the core
get an HMI. With existing code, the individual threads call the opal hmi
handler independently, which can easily throw TB out of sync if we have guests
running on subcores. Hence we will need to co-ordinate with all the
threads before making the opal hmi handler call followed by TB resync.
This patch introduces a sibling subcore state structure (shared by all
threads in the core) in paca which holds information about whether sibling
subcores are in Guest mode or host mode. An array in_guest[] of size
MAX_SUBCORE_PER_CORE=4 is used to maintain the state of each subcore.
The subcore id is used as index into in_guest[] array. Only primary
thread entering/exiting the guest is responsible to set/unset its
designated array element.
On TB error, we get HMI interrupt on every thread on the core. Upon HMI,
this patch will now force guest to vacate the core/subcore. Primary
thread from each subcore will then turn off its respective bit
from the above bitmap during the guest exit path just after the
guest->host partition switch is complete.
All other threads that have just exited the guest OR were already in host
will wait until all other subcores clear their respective bit.
Once all the subcores turn off their respective bit, all threads will
make a call to the opal hmi handler.
It is not necessary that opal hmi handler would resync the TB value for
every HMI interrupt. It would do so only for the HMI caused due to
TB errors. For rest, it would not touch TB value. Hence to make things
simpler, primary thread would call TB resync explicitly once for each
core immediately after opal hmi handler instead of subtracting guest
offset from TB. TB resync call will restore the TB with host value.
Thus we can be sure about the TB state.
One of the primary threads exiting the guest will take up the
responsibility of calling TB resync. It will use one of the top bits
(bit 63) from subcore state flags bitmap to make the decision. The first
primary thread (among the subcores) that is able to set the bit will
have to call the TB resync. All other threads will wait until TB
resync is complete. Once TB resync is complete all threads will then
proceed.
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
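The coordination described in this commit message can be modelled in plain C.
The block below is a userspace sketch only, not the kernel implementation: the
struct, the function names and TB_RESYNC_REQ_BIT are hypothetical, and C11
atomics stand in for whatever primitives the real code uses. It shows the two
ideas from the text: per-subcore in_guest[] flags that every thread waits on,
and a single "bit 63" flag that exactly one primary thread wins.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_SUBCORE_PER_CORE	4
#define TB_RESYNC_REQ_BIT	(1ULL << 63)	/* hypothetical name for "bit 63" */

/* Hypothetical stand-in for the per-core state shared by sibling threads. */
struct subcore_state {
	atomic_uchar in_guest[MAX_SUBCORE_PER_CORE];
	atomic_uint_fast64_t flags;
};

/* The primary thread of a subcore records guest entry (true) or exit (false). */
static void set_in_guest(struct subcore_state *s, int subcore, bool in_guest)
{
	atomic_store(&s->in_guest[subcore], in_guest ? 1 : 0);
}

/* True once every subcore has left the guest; waiting threads poll this. */
static bool all_subcores_in_host(struct subcore_state *s)
{
	for (int i = 0; i < MAX_SUBCORE_PER_CORE; i++)
		if (atomic_load(&s->in_guest[i]))
			return false;
	return true;
}

/* Only the first caller to set bit 63 gets true and must do the TB resync. */
static bool claim_tb_resync(struct subcore_state *s)
{
	uint_fast64_t old = atomic_fetch_or(&s->flags, TB_RESYNC_REQ_BIT);

	return !(old & TB_RESYNC_REQ_BIT);
}
```

In this model a thread would spin on all_subcores_in_host() before calling the
HMI handler, and the thread for which claim_tb_resync() returns true would
perform the resync while the others wait for it to finish.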
2016-05-15 09:44:26 +05:30
# include <asm/hmi.h>
2013-04-28 13:20:08 +08:00
# include <sysdev/fsl_pci.h>
2016-11-21 22:36:41 +05:30
# include <asm/kprobes.h>
2018-08-01 18:33:20 -03:00
# include <asm/stacktrace.h>
2019-03-12 21:18:23 +01:00
# include <asm/nmi.h>
2021-05-20 10:23:10 +00:00
# include <asm/disassemble.h>
2005-10-01 18:43:42 +10:00
2016-11-29 23:45:50 +11:00
# if defined(CONFIG_DEBUGGER) || defined(CONFIG_KEXEC_CORE)
2010-01-12 00:50:14 +00:00
int ( * __debugger ) ( struct pt_regs * regs ) __read_mostly ;
int ( * __debugger_ipi ) ( struct pt_regs * regs ) __read_mostly ;
int ( * __debugger_bpt ) ( struct pt_regs * regs ) __read_mostly ;
int ( * __debugger_sstep ) ( struct pt_regs * regs ) __read_mostly ;
int ( * __debugger_iabr_match ) ( struct pt_regs * regs ) __read_mostly ;
2012-12-20 14:06:44 +00:00
int ( * __debugger_break_match ) ( struct pt_regs * regs ) __read_mostly ;
2010-01-12 00:50:14 +00:00
int ( * __debugger_fault_handler ) ( struct pt_regs * regs ) __read_mostly ;
2005-09-26 16:04:21 +10:00
EXPORT_SYMBOL ( __debugger ) ;
EXPORT_SYMBOL ( __debugger_ipi ) ;
EXPORT_SYMBOL ( __debugger_bpt ) ;
EXPORT_SYMBOL ( __debugger_sstep ) ;
EXPORT_SYMBOL ( __debugger_iabr_match ) ;
2012-12-20 14:06:44 +00:00
EXPORT_SYMBOL ( __debugger_break_match ) ;
2005-09-26 16:04:21 +10:00
EXPORT_SYMBOL ( __debugger_fault_handler ) ;
# endif
2013-02-13 16:21:32 +00:00
/* Transactional Memory trap debug */
# ifdef TM_DEBUG_SW
# define TM_DEBUG(x...) printk(KERN_INFO x)
# else
# define TM_DEBUG(x...) do { } while(0)
# endif
2018-08-01 18:33:18 -03:00
static const char *signame(int signr)
{
	switch (signr) {
	case SIGBUS:	return "bus error";
	case SIGFPE:	return "floating point exception";
	case SIGILL:	return "illegal instruction";
	case SIGSEGV:	return "segfault";
	case SIGTRAP:	return "unhandled trap";
	}

	return "unknown signal";
}
2005-09-26 16:04:21 +10:00
/*
* Trap & Exception support
*/
2007-03-20 20:38:12 -05:00
# ifdef CONFIG_PMAC_BACKLIGHT
static void pmac_backlight_unblank(void)
{
	mutex_lock(&pmac_backlight_mutex);
	if (pmac_backlight) {
		struct backlight_properties *props;

		props = &pmac_backlight->props;
		props->brightness = props->max_brightness;
		props->power = FB_BLANK_UNBLANK;
		backlight_update_status(pmac_backlight);
	}
	mutex_unlock(&pmac_backlight_mutex);
}
# else
static inline void pmac_backlight_unblank ( void ) { }
# endif
powerpc/powernv: Use kernel crash path for machine checks
There are quite a few machine check exceptions that can be caused by
kernel bugs. To make debugging easier, use the kernel crash path in
cases of synchronous machine checks that occur in kernel mode, if that
would not result in the machine going straight to panic or crash dump.
There is a downside here that die()ing the process in kernel mode can
still leave the system unstable. panic_on_oops will always force the
system to fail-stop, so systems where that behaviour is important will
still do the right thing.
As a test, when triggering an i-side 0111b error (ifetch from foreign
address) in kernel mode process context on POWER9, the kernel currently
dies quickly like this:
Severe Machine check interrupt [Not recovered]
NIP [ffff000000000000]: 0xffff000000000000
Initiator: CPU
Error type: Real address [Instruction fetch (foreign)]
[ 127.426651616,0] OPAL: Reboot requested due to Platform error.
Effective[ 127.426693712,3] OPAL: Reboot requested due to Platform error. address: ffff000000000000
opal: Reboot type 1 not supported
Kernel panic - not syncing: PowerNV Unrecovered Machine Check
CPU: 56 PID: 4425 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #35
Call Trace:
[ 128.017988928,4] IPMI: BUG: Dropping ESEL on the floor due to
buggy/mising code in OPAL for this BMC
Rebooting in 10 seconds..
Trying to free IRQ 496 from IRQ context!
After this patch, the process is killed and the kernel continues with
this message, which gives enough information to identify the offending
branch (i.e., with CFAR):
Severe Machine check interrupt [Not recovered]
NIP [ffff000000000000]: 0xffff000000000000
Initiator: CPU
Error type: Real address [Instruction fetch (foreign)]
Effective address: ffff000000000000
Oops: Machine check, sig: 7 [#1]
SMP NR_CPUS=2048
NUMA
PowerNV
Modules linked in: iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 ...
CPU: 22 PID: 4436 Comm: syscall Tainted: G M 4.12.0-rc1-13857-ga4700a261072-dirty #36
task: c000000932300000 task.stack: c000000932380000
NIP: ffff000000000000 LR: 00000000217706a4 CTR: ffff000000000000
REGS: c00000000fc8fd80 TRAP: 0200 Tainted: G M (4.12.0-rc1-13857-ga4700a261072-dirty)
MSR: 90000000001c1003 <SF,HV,ME,RI,LE>
CR: 24000484 XER: 20000000
CFAR: c000000000004c80 DAR: 0000000021770a90 DSISR: 0a000000 SOFTE: 1
GPR00: 0000000000001ebe 00007fffce4818b0 0000000021797f00 0000000000000000
GPR04: 00007fff8007ac24 0000000044000484 0000000000004000 00007fff801405e8
GPR08: 900000000280f033 0000000024000484 0000000000000000 0000000000000030
GPR12: 9000000000001003 00007fff801bc370 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR28: 00007fff801b0000 0000000000000000 00000000217707a0 00007fffce481918
NIP [ffff000000000000] 0xffff000000000000
LR [00000000217706a4] 0x217706a4
Call Trace:
Instruction dump:
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-07-19 16:59:11 +10:00
/*
 * If oops/die is expected to crash the machine, return true here.
 *
 * This should not be expected to be 100% accurate, there may be
 * notifiers registered or other unexpected conditions that may bring
 * down the kernel. Or if the current process in the kernel is holding
 * locks or has other critical state, the kernel may become effectively
 * unusable anyway.
 */
bool die_will_crash(void)
{
	if (should_fadump_crash())
		return true;
	if (kexec_should_crash(current))
		return true;
	if (in_interrupt() || panic_on_oops ||
			!current->pid || is_global_init(current))
		return true;

	return false;
}
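The decision made by die_will_crash() can be restated as a pure function over
its inputs, which makes the individual cases easy to check. This is a
hypothetical userspace model, not kernel code: struct crash_inputs and
model_die_will_crash() are invented names, and each field only stands in for
the kernel helper named in its comment.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical snapshot of the conditions die_will_crash() consults. */
struct crash_inputs {
	bool fadump_crash;	/* stands in for should_fadump_crash() */
	bool kexec_crash;	/* stands in for kexec_should_crash(current) */
	bool in_interrupt;	/* stands in for in_interrupt() */
	bool panic_on_oops;
	int pid;		/* 0 models the idle task */
	bool is_init;		/* stands in for is_global_init(current) */
};

static bool model_die_will_crash(const struct crash_inputs *in)
{
	/* A registered crash dumper takes the machine down. */
	if (in->fadump_crash || in->kexec_crash)
		return true;

	/* Otherwise only contexts that cannot simply be killed are fatal. */
	return in->in_interrupt || in->panic_on_oops || !in->pid || in->is_init;
}
```

An ordinary user task with no dumper registered is the only case in which the
model returns false, matching the comment that killing the process is then
expected to leave the kernel running.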
2011-11-30 00:23:13 +00:00
static arch_spinlock_t die_lock = __ARCH_SPIN_LOCK_UNLOCKED;
static int die_owner = -1;
static unsigned int die_nest_count;
static int die_counter;
2017-12-24 02:49:23 +10:00
extern void panic_flush_kmsg_start(void)
{
	/*
	 * These are mostly taken from kernel/panic.c, but try to do
	 * relatively minimal work. Don't use delay functions (TB may
	 * be broken), don't crash dump (need to set a firmware log),
	 * don't run notifiers. We do want to get some information to
	 * Linux console.
	 */
	console_verbose();
	bust_spinlocks(1);
}

extern void panic_flush_kmsg_end(void)
{
	printk_safe_flush_on_panic();
	kmsg_dump(KMSG_DUMP_PANIC);
	bust_spinlocks(0);
	debug_locks_off();
2019-05-17 14:31:50 -07:00
console_flush_on_panic ( CONSOLE_FLUSH_PENDING ) ;
2017-12-24 02:49:23 +10:00
}
2016-09-16 20:48:08 +10:00
static unsigned long oops_begin ( struct pt_regs * regs )
2005-09-26 16:04:21 +10:00
{
2011-11-30 00:23:13 +00:00
int cpu ;
2007-03-20 20:38:13 -05:00
unsigned long flags ;
2005-09-26 16:04:21 +10:00
2007-03-20 20:38:11 -05:00
oops_enter ( ) ;
2011-11-30 00:23:13 +00:00
/* racy, but better than risking deadlock. */
	raw_local_irq_save(flags);
	cpu = smp_processor_id();
	if (!arch_spin_trylock(&die_lock)) {
		if (cpu == die_owner)
			/* nested oops. should stop eventually */;
		else
			arch_spin_lock(&die_lock);
2007-03-20 20:38:13 -05:00
}
2011-11-30 00:23:13 +00:00
	die_nest_count++;
	die_owner = cpu;
	console_verbose();
	bust_spinlocks(1);
	if (machine_is(powermac))
		pmac_backlight_unblank();
	return flags;
}
2016-09-16 20:48:08 +10:00
NOKPROBE_SYMBOL ( oops_begin ) ;
2006-03-28 23:15:54 +11:00
2016-09-16 20:48:08 +10:00
static void oops_end ( unsigned long flags , struct pt_regs * regs ,
2011-11-30 00:23:13 +00:00
int signr )
{
2005-09-26 16:04:21 +10:00
bust_spinlocks ( 0 ) ;
2013-01-21 17:17:39 +10:30
add_taint ( TAINT_DIE , LOCKDEP_NOW_UNRELIABLE ) ;
2011-11-30 00:23:13 +00:00
	die_nest_count--;
2011-11-30 00:23:09 +00:00
	oops_exit();
	printk("\n");
2016-11-08 23:14:45 +11:00
if ( ! die_nest_count ) {
2011-11-30 00:23:13 +00:00
/* Nest count reaches zero, release the lock. */
2016-11-08 23:14:45 +11:00
die_owner = - 1 ;
2011-11-30 00:23:13 +00:00
arch_spin_unlock ( & die_lock ) ;
2016-11-08 23:14:45 +11:00
}
2011-11-30 00:23:13 +00:00
raw_local_irq_restore ( flags ) ;
2005-12-04 18:39:43 +11:00
powerpc/64s: sreset panic if there is no debugger or crash dump handlers
system_reset_exception does most of its own crash handling now,
invoking the debugger or crash dumps if they are registered. If not,
then it goes through to die() to print stack traces, and then is
supposed to panic (according to comments).
However after die() prints oopses, it does its own handling which
doesn't allow system_reset_exception to panic (e.g., it may just
kill the current process). This patch causes sreset exceptions to
return from die after it prints messages but before acting.
This also stops die from invoking the debugger on 0x100 crashes.
system_reset_exception similarly calls the debugger. It had been
thought this was harmless (because if the debugger was disabled,
neither call would fire, and if it was enabled the first call
would return). However in some cases like xmon 'X' command, the
debugger returns 0, which currently causes it to be entered
again (first in system_reset_exception, then in die), which is
confusing.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-03-27 01:01:16 +10:00
	/*
	 * system_reset_exception handles debugger, crash dump, panic, for 0x100
	 */
2021-04-14 19:00:33 +08:00
	if (TRAP(regs) == INTERRUPT_SYSTEM_RESET)
2018-03-27 01:01:16 +10:00
return ;
2012-02-16 01:14:45 +00:00
	crash_fadump(regs, "die oops");
2017-07-05 13:56:27 +10:00
if ( kexec_should_crash ( current ) )
2005-12-04 18:39:43 +11:00
crash_kexec ( regs ) ;
2011-11-30 00:23:10 +00:00
2011-11-30 00:23:13 +00:00
if ( ! signr )
return ;
2011-11-30 00:23:09 +00:00
	/*
	 * While our oops output is serialised by a spinlock, output
	 * from panic() called below can race and corrupt it. If we
	 * know we are going to panic, delay for 1 second so we have a
	 * chance to get clean backtraces from all CPUs that are oopsing.
	 */
	if (in_interrupt() || panic_on_oops || !current->pid ||
	    is_global_init(current)) {
		mdelay(MSEC_PER_SEC);
	}
2006-07-30 03:03:34 -07:00
if ( panic_on_oops )
2006-08-13 23:24:22 -07:00
		panic("Fatal exception");
2011-11-30 00:23:13 +00:00
do_exit ( signr ) ;
}
2016-09-16 20:48:08 +10:00
NOKPROBE_SYMBOL ( oops_end ) ;
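The owner and nest-count bookkeeping in oops_begin() and oops_end() is easier
to follow in isolation. The block below is a single-threaded userspace model
under stated assumptions: a bool stands in for die_lock, struct die_state and
the model_* functions are hypothetical names, and the point where the kernel
would spin in arch_spin_lock() is reported as a false return instead.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: a bool stands in for die_lock; no real concurrency. */
struct die_state {
	bool locked;
	int owner;			/* CPU holding the lock, -1 when free */
	unsigned int nest_count;	/* depth of nested oopses on the owner */
};

/* Returns false where the kernel code would wait for another CPU's oops. */
static bool model_oops_begin(struct die_state *s, int cpu)
{
	if (s->locked && s->owner != cpu)
		return false;

	/* First entry, or a nested oops on the CPU that already owns the lock. */
	s->locked = true;
	s->nest_count++;
	s->owner = cpu;
	return true;
}

static void model_oops_end(struct die_state *s)
{
	s->nest_count--;
	if (!s->nest_count) {
		/* Outermost oops finished: clear the owner and release. */
		s->owner = -1;
		s->locked = false;
	}
}
```

The lock is only released when the nest count returns to zero, so a nested
oops on the owning CPU neither deadlocks on its own lock nor lets another CPU
interleave its output.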
2006-07-30 03:03:34 -07:00
2019-07-11 20:28:14 +05:30
static char *get_mmu_str(void)
{
	if (early_radix_enabled())
		return " MMU=Radix";
	if (early_mmu_has_feature(MMU_FTR_HPTE_TABLE))
		return " MMU=Hash";
	return "";
}
2016-09-16 20:48:08 +10:00
static int __die ( const char * str , struct pt_regs * regs , long err )
2011-11-30 00:23:13 +00:00
{
	printk("Oops: %s, sig: %ld [#%d]\n", str, err, ++die_counter);
2017-08-23 23:56:21 +10:00
2019-07-11 20:28:14 +05:30
	printk("%s PAGE_SIZE=%luK%s%s%s%s%s%s %s\n",
2019-01-10 22:57:35 +11:00
	       IS_ENABLED(CONFIG_CPU_LITTLE_ENDIAN) ? "LE" : "BE",
2019-07-11 20:28:14 +05:30
	       PAGE_SIZE / 1024, get_mmu_str(),
2019-01-10 22:57:35 +11:00
	       IS_ENABLED(CONFIG_PREEMPT) ? " PREEMPT" : "",
	       IS_ENABLED(CONFIG_SMP) ? " SMP" : "",
	       IS_ENABLED(CONFIG_SMP) ? (" NR_CPUS=" __stringify(NR_CPUS)) : "",
	       debug_pagealloc_enabled() ? " DEBUG_PAGEALLOC" : "",
	       IS_ENABLED(CONFIG_NUMA) ? " NUMA" : "",
	       ppc_md.name ? ppc_md.name : "");
2011-11-30 00:23:13 +00:00
	if (notify_die(DIE_OOPS, str, regs, err, 255, SIGSEGV) == NOTIFY_STOP)
		return 1;

	print_modules();
	show_regs(regs);
2005-09-26 16:04:21 +10:00
return 0 ;
}
2016-09-16 20:48:08 +10:00
NOKPROBE_SYMBOL ( __die ) ;
2005-09-26 16:04:21 +10:00
2011-11-30 00:23:13 +00:00
void die ( const char * str , struct pt_regs * regs , long err )
{
2016-11-08 23:14:44 +11:00
unsigned long flags ;
2018-03-27 01:01:16 +10:00
	/*
	 * system_reset_exception handles debugger, crash dump, panic, for 0x100
	 */
2021-04-14 19:00:33 +08:00
	if (TRAP(regs) != INTERRUPT_SYSTEM_RESET) {
2018-03-27 01:01:16 +10:00
if ( debugger ( regs ) )
return ;
}
2011-11-30 00:23:13 +00:00
2016-11-08 23:14:44 +11:00
flags = oops_begin ( regs ) ;
2011-11-30 00:23:13 +00:00
if ( __die ( str , regs , err ) )
err = 0 ;
oops_end ( flags , regs , err ) ;
}
2017-06-29 23:19:19 +05:30
NOKPROBE_SYMBOL ( die ) ;
2011-11-30 00:23:13 +00:00
2018-04-16 14:18:26 -05:00
void user_single_step_report ( struct pt_regs * regs )
2009-12-15 16:47:18 -08:00
{
2019-05-23 11:04:24 -05:00
	force_sig_fault(SIGTRAP, TRAP_TRACE, (void __user *)regs->nip);
2009-12-15 16:47:18 -08:00
}
2018-08-16 15:27:47 +10:00
static void show_signal_msg ( int signr , struct pt_regs * regs , int code ,
unsigned long addr )
2018-08-01 18:33:16 -03:00
{
static DEFINE_RATELIMIT_STATE ( rs , DEFAULT_RATELIMIT_INTERVAL ,
DEFAULT_RATELIMIT_BURST ) ;
2018-08-16 15:27:47 +10:00
if ( ! show_unhandled_signals )
2018-08-01 18:33:16 -03:00
return ;
if ( ! unhandled_signal ( current , signr ) )
return ;
2018-08-16 15:27:47 +10:00
if ( ! __ratelimit ( & rs ) )
return ;
2018-08-01 18:33:18 -03:00
	pr_info("%s[%d]: %s (%d) at %lx nip %lx lr %lx code %x",
		current->comm, current->pid, signame(signr), signr,
2018-08-01 18:33:17 -03:00
		addr, regs->nip, regs->link, code);
2018-08-01 18:33:18 -03:00
	print_vma_addr(KERN_CONT " in ", regs->nip);
	pr_cont("\n");
2018-08-01 18:33:20 -03:00
show_user_instructions ( regs ) ;
2018-08-01 18:33:15 -03:00
}
2018-01-18 17:50:42 -08:00
2018-09-18 09:37:28 +02:00
static bool exception_common ( int signr , struct pt_regs * regs , int code ,
unsigned long addr )
2005-09-26 16:04:21 +10:00
{
if ( ! user_mode ( regs ) ) {
2011-11-30 00:23:13 +00:00
		die("Exception in kernel mode", regs, signr);
2018-09-18 09:37:28 +02:00
return false ;
2011-11-30 00:23:13 +00:00
}
2018-08-01 18:33:15 -03:00
show_signal_msg ( signr , regs , code , addr ) ;
2005-09-26 16:04:21 +10:00
2021-01-30 23:08:39 +10:00
if ( arch_irqs_disabled ( ) )
interrupt_cond_local_irq_enable ( regs ) ;
2012-03-01 15:47:44 +11:00
2012-08-23 21:27:09 +00:00
	current->thread.trap_nr = code;
2018-01-18 17:50:43 -08:00
2018-09-18 09:37:28 +02:00
return true ;
}
2018-09-18 10:56:25 +02:00
void _exception_pkey ( struct pt_regs * regs , unsigned long addr , int key )
2018-09-18 09:37:28 +02:00
{
2018-09-18 10:56:25 +02:00
if ( ! exception_common ( SIGSEGV , regs , SEGV_PKUERR , addr ) )
2018-09-18 09:37:28 +02:00
return ;
2018-09-18 11:26:32 +02:00
force_sig_pkuerr ( ( void __user * ) addr , key ) ;
2005-09-26 16:04:21 +10:00
}
2018-01-18 17:50:42 -08:00
void _exception ( int signr , struct pt_regs * regs , int code , unsigned long addr )
{
2018-09-18 09:43:32 +02:00
if ( ! exception_common ( signr , regs , code , addr ) )
return ;
2019-05-23 11:04:24 -05:00
force_sig_fault ( signr , code , ( void __user * ) addr ) ;
2018-01-18 17:50:42 -08:00
}
2019-02-26 18:51:07 +10:00
/*
 * The interrupt architecture has a quirk in that the HV interrupts excluding
 * the NMIs (0x100 and 0x200) do not clear MSR[RI] at entry. The first thing
 * that an interrupt handler must do is save off a GPR into a scratch register,
 * and all interrupts on POWERNV (HV=1) use the HSPRG1 register as scratch.
 * Therefore an NMI can clobber an HV interrupt's live HSPRG1 without noticing
 * that it is non-reentrant, which leads to random data corruption.
 *
 * The solution is for NMI interrupts in HV mode to check if they originated
 * from these critical HV interrupt regions. If so, then mark them not
 * recoverable.
 *
 * An alternative would be for HV NMIs to use SPRG for scratch to avoid the
 * HSPRG1 clobber, however this would cause guest SPRG to be clobbered. Linux
 * guests should always have MSR[RI]=0 when its scratch SPRG is in use, so
 * that would work. However any other guest OS that may have the SPRG live
 * and MSR[RI]=1 could encounter silent corruption.
 *
 * Builds that do not support KVM could take this second option to increase
 * the recoverability of NMIs.
 */
void hv_nmi_check_nonrecoverable(struct pt_regs *regs)
{
#ifdef CONFIG_PPC_POWERNV
	unsigned long kbase = (unsigned long)_stext;
	unsigned long nip = regs->nip;

	if (!(regs->msr & MSR_RI))
		return;
	if (!(regs->msr & MSR_HV))
		return;
	if (regs->msr & MSR_PR)
		return;

	/*
	 * Now test if the interrupt has hit a range that may be using
	 * HSPRG1 without having RI=0 (i.e., an HSRR interrupt). The
	 * problem ranges all run un-relocated. Test real and virt modes
2021-02-24 13:25:47 +05:30
	 * at the same time by dropping the high bit of the nip (virt mode
2019-02-26 18:51:07 +10:00
	 * entry points still have the +0x4000 offset).
	 */
	nip &= ~0xc000000000000000ULL;
	if ((nip >= 0x500 && nip < 0x600) || (nip >= 0x4500 && nip < 0x4600))
		goto nonrecoverable;
	if ((nip >= 0x980 && nip < 0xa00) || (nip >= 0x4980 && nip < 0x4a00))
		goto nonrecoverable;
	if ((nip >= 0xe00 && nip < 0xec0) || (nip >= 0x4e00 && nip < 0x4ec0))
		goto nonrecoverable;
	if ((nip >= 0xf80 && nip < 0xfa0) || (nip >= 0x4f80 && nip < 0x4fa0))
		goto nonrecoverable;
2019-03-01 22:56:36 +10:00
2019-02-26 18:51:07 +10:00
/* Trampoline code runs un-relocated so subtract kbase. */
2019-03-01 22:56:36 +10:00
if ( nip > = ( unsigned long ) ( start_real_trampolines - kbase ) & &
nip < ( unsigned long ) ( end_real_trampolines - kbase ) )
2019-02-26 18:51:07 +10:00
goto nonrecoverable ;
2019-03-01 22:56:36 +10:00
if ( nip > = ( unsigned long ) ( start_virt_trampolines - kbase ) & &
nip < ( unsigned long ) ( end_virt_trampolines - kbase ) )
2019-02-26 18:51:07 +10:00
goto nonrecoverable ;
return ;
nonrecoverable :
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr & ~ MSR_RI ) ;
2019-02-26 18:51:07 +10:00
# endif
}
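The address test in hv_nmi_check_nonrecoverable() can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: `struct window` and `nip_in_hsprg1_window()` are hypothetical names, the four ranges are the fixed vector windows tested above, and the trampoline ranges are left out because they depend on linker symbols.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: one half-open [start, end) vector window. */
struct window {
	uint64_t start;
	uint64_t end;
};

/* The fixed windows tested by hv_nmi_check_nonrecoverable(). */
static const struct window hv_windows[] = {
	{ 0x500, 0x600 },
	{ 0x980, 0xa00 },
	{ 0xe00, 0xec0 },
	{ 0xf80, 0xfa0 },
};

/*
 * Return true if nip falls in one of the windows, in either real mode
 * or virtual mode (virtual-mode entry points sit at +0x4000).
 */
bool nip_in_hsprg1_window(uint64_t nip)
{
	size_t i;

	/* Clear the top two address bits so 0xc000... and 0x0... compare equal. */
	nip &= ~0xc000000000000000ULL;

	for (i = 0; i < sizeof(hv_windows) / sizeof(hv_windows[0]); i++) {
		const struct window *w = &hv_windows[i];

		if (nip >= w->start && nip < w->end)
			return true;
		if (nip >= w->start + 0x4000 && nip < w->end + 0x4000)
			return true;
	}
	return false;
}
```

A table keeps the real-mode and +0x4000 checks in one place; the kernel function spells the comparisons out instead, which avoids a loop in NMI context.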
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER_NMI ( system_reset_exception )
2005-09-26 16:04:21 +10:00
{
2019-02-26 18:51:08 +10:00
unsigned long hsrr0 , hsrr1 ;
bool saved_hsrrs = false ;
2016-12-20 04:30:07 +10:00
2019-02-26 18:51:08 +10:00
/*
* System reset can interrupt code where HSRRs are live and MSR [ RI ] = 1.
* The system reset interrupt itself may clobber HSRRs ( e . g . , to call
* OPAL ) , so save them here and restore them before returning .
*
* Machine checks don ' t need to save HSRRs , as the real mode handler
* is careful to avoid them , and the regular handler is not delivered
* as an NMI .
*/
if ( cpu_has_feature ( CPU_FTR_HVMODE ) ) {
hsrr0 = mfspr ( SPRN_HSRR0 ) ;
hsrr1 = mfspr ( SPRN_HSRR1 ) ;
saved_hsrrs = true ;
}
2019-02-26 18:51:07 +10:00
hv_nmi_check_nonrecoverable ( regs ) ;
2017-08-01 22:00:53 +10:00
__this_cpu_inc ( irq_stat . sreset_irqs ) ;
2005-09-26 16:04:21 +10:00
/* See if there are any machine-dependent calls */
2006-01-04 19:55:53 +00:00
if ( ppc_md . system_reset_exception ) {
if ( ppc_md . system_reset_exception ( regs ) )
2016-12-20 04:30:05 +10:00
goto out ;
2006-01-04 19:55:53 +00:00
}
2005-09-26 16:04:21 +10:00
2017-07-05 13:56:27 +10:00
if ( debugger ( regs ) )
goto out ;
2019-09-04 13:29:49 +05:30
kmsg_dump ( KMSG_DUMP_OOPS ) ;
2017-07-05 13:56:27 +10:00
/*
* A system reset is a request to dump , so we always send
* it through the crashdump code ( if fadump or kdump are
* registered ) .
*/
crash_fadump ( regs , " System Reset " ) ;
crash_kexec ( regs ) ;
/*
* We aren ' t the primary crash CPU . We need to send it
* to a holding pattern to avoid it ending up in the panic
* code .
*/
crash_kexec_secondary ( regs ) ;
/*
* No debugger or crash dump registered , print logs then
* panic .
*/
2017-12-24 02:49:22 +10:00
die ( " System Reset " , regs , SIGABRT ) ;
2017-07-05 13:56:27 +10:00
mdelay ( 2 * MSEC_PER_SEC ) ; /* Wait a little while for others to print */
add_taint ( TAINT_DIE , LOCKDEP_NOW_UNRELIABLE ) ;
nmi_panic ( regs , " System Reset " ) ;
2005-09-26 16:04:21 +10:00
2016-12-20 04:30:05 +10:00
out :
# ifdef CONFIG_PPC_BOOK3S_64
BUG_ON ( get_paca ( ) - > in_nmi = = 0 ) ;
if ( get_paca ( ) - > in_nmi > 1 )
2020-05-08 14:34:07 +10:00
die ( " Unrecoverable nested System Reset " , regs , SIGABRT ) ;
2016-12-20 04:30:05 +10:00
# endif
2005-09-26 16:04:21 +10:00
/* Must die if the interrupt is not recoverable */
2021-01-30 23:08:35 +10:00
if ( ! ( regs - > msr & MSR_RI ) ) {
/* For the reason explained in die_mce, nmi_exit before die */
nmi_exit ( ) ;
2020-05-08 14:34:07 +10:00
die ( " Unrecoverable System Reset " , regs , SIGABRT ) ;
2021-01-30 23:08:35 +10:00
}
2005-09-26 16:04:21 +10:00
2019-02-26 18:51:08 +10:00
if ( saved_hsrrs ) {
mtspr ( SPRN_HSRR0 , hsrr0 ) ;
mtspr ( SPRN_HSRR1 , hsrr1 ) ;
}
2005-09-26 16:04:21 +10:00
/* What should we do here? We could issue a shutdown or hard reset. */
2021-01-30 23:08:38 +10:00
return 0 ;
2005-09-26 16:04:21 +10:00
}
powerpc/book3s: handle machine check in Linux host.
Move machine check entry point into Linux. So far we were dependent on
firmware to decode MCE error details and handover the high level info to OS.
This patch introduces early machine check routine that saves the MCE
information (srr1, srr0, dar and dsisr) to the emergency stack. We allocate
stack frame on emergency stack and set the r1 accordingly. This allows us to be
prepared to take another exception without losing context. One thing to note
here is that, if we get another machine check while ME bit is off then we risk a
checkstop. Hence we restrict ourselves to save only MCE information and
register saved on PACA_EXMC save area before we turn the ME bit on. We use
paca->in_mce flag to differentiate between first entry and nested machine check
entry which helps proper use of emergency stack. We increment paca->in_mce
every time we enter in early machine check handler and decrement it while
leaving. When we enter machine check early handler first time (paca->in_mce ==
0), we are sure nobody is using MC emergency stack and allocate a stack frame
at the start of the emergency stack. During subsequent entry (paca->in_mce >
0), we know that r1 points inside emergency stack and we allocate separate
stack frame accordingly. This prevents us from clobbering MCE information
during nested machine checks.
The early machine check handler changes are placed under CPU_FTR_HVMODE
section. This makes sure that the early machine check handler will get executed
only in hypervisor kernel.
This is the code flow:
    Machine Check Interrupt
               |
               V
       0x200 vector                          ME=0, IR=0, DR=0
               |
               V
    +-----------------------------------------------+
    |machine_check_pSeries_early:                   | ME=0, IR=0, DR=0
    |       Alloc frame on emergency stack          |
    |       Save srr1, srr0, dar and dsisr on stack |
    +-----------------------------------------------+
               |
       (ME=1, IR=0, DR=0, RFID)
               |
               V
       machine_check_handle_early            ME=1, IR=0, DR=0
               |
               V
    +-----------------------------------------------+
    |       machine_check_early (r3=pt_regs)        | ME=1, IR=0, DR=0
    |       Things to do: (in next patches)         |
    |               Flush SLB for SLB errors        |
    |               Flush TLB for TLB errors        |
    |               Decode and save MCE info        |
    +-----------------------------------------------+
               |
       (Fall through existing exception handler routine.)
               |
               V
       machine_check_pSerie                  ME=1, IR=0, DR=0
               |
       (ME=1, IR=1, DR=1, RFID)
               |
               V
       machine_check_common                  ME=1, IR=1, DR=1
               .
               .
               .
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-10-30 20:04:08 +05:30
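The nesting rule described in the commit message above (first entry takes a frame at the start of the emergency stack, nested entries take one below the current r1) can be modelled in a few lines. This is only an illustrative sketch: `struct mce_cpu`, `mce_enter()`, `mce_exit()` and `FRAME_SIZE` are hypothetical names, the frame size is arbitrary, and a downward-growing stack is assumed.

```c
#include <stdint.h>

#define FRAME_SIZE 512u	/* assumption: illustrative frame size only */

/* Hypothetical stand-in for the per-CPU state the commit describes. */
struct mce_cpu {
	unsigned int in_mce;		/* nesting depth, like paca->in_mce */
	uintptr_t emergency_top;	/* start (top) of the MC emergency stack */
};

/*
 * Pick the stack frame for a machine check entry. r1 is the stack
 * pointer at the time of the interrupt. Returns the new frame address
 * and increments the nesting count.
 */
uintptr_t mce_enter(struct mce_cpu *cpu, uintptr_t r1)
{
	uintptr_t base;

	if (cpu->in_mce == 0)
		base = cpu->emergency_top;	/* first entry: stack is unused */
	else
		base = r1;			/* nested: r1 is already on it */

	cpu->in_mce++;
	return base - FRAME_SIZE;
}

/* Leave the early handler: drop one nesting level. */
void mce_exit(struct mce_cpu *cpu)
{
	cpu->in_mce--;
}
```

Because a nested entry allocates below the interrupted r1, the frame holding the first machine check's saved srr0/srr1/dar/dsisr is never overwritten.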
2005-09-26 16:04:21 +10:00
/*
 * I/O accesses can cause machine checks on powermacs.
 * Check if the NIP corresponds to the address of a sync
 * instruction for which there is an entry in the exception
 * table.
 *  -- paulus.
 */
static inline int check_io_access(struct pt_regs *regs)
{
#ifdef CONFIG_PPC32
	unsigned long msr = regs->msr;
	const struct exception_table_entry *entry;
	unsigned int *nip = (unsigned int *)regs->nip;

	if (((msr & 0xffff0000) == 0 || (msr & (0x80000 | 0x40000)))
	    && (entry = search_exception_tables(regs->nip)) != NULL) {
		/*
		 * Check that it's a sync instruction, or somewhere
		 * in the twi; isync; nop sequence that inb/inw/inl uses.
		 * As the address is in the exception table
		 * we should be able to read the instr there.
		 * For the debug message, we look at the preceding
		 * load or store.
		 */
		if (*nip == PPC_RAW_NOP())
			nip -= 2;
		else if (*nip == PPC_RAW_ISYNC())
			--nip;
		if (*nip == PPC_RAW_SYNC() || get_op(*nip) == OP_TRAP) {
			unsigned int rb;

			--nip;
			rb = (*nip >> 11) & 0x1f;
			printk(KERN_DEBUG "%s bad port %lx at %p\n",
			       (*nip & 0x100) ? "OUT to" : "IN from",
			       regs->gpr[rb] - _IO_BASE, nip);
			regs_set_return_msr(regs, regs->msr | MSR_RI);
			regs_set_return_ip(regs, extable_fixup(entry));
			return 1;
		}
	}
#endif /* CONFIG_PPC32 */
	return 0;
}
2010-02-08 11:50:57 +00:00
# ifdef CONFIG_PPC_ADV_DEBUG_REGS
2005-09-26 16:04:21 +10:00
/* On 4xx, the reason for the machine check or program exception
is in the ESR . */
# define get_reason(regs) ((regs)->dsisr)
# define REASON_FP ESR_FP
# define REASON_ILLEGAL (ESR_PIL | ESR_PUO)
# define REASON_PRIVILEGED ESR_PPR
# define REASON_TRAP ESR_PTR
2020-05-06 13:40:48 +10:00
# define REASON_PREFIXED 0
# define REASON_BOUNDARY 0
2005-09-26 16:04:21 +10:00
/* single-step stuff */
2013-07-04 11:45:46 +05:30
# define single_stepping(regs) (current->thread.debug.dbcr0 & DBCR0_IC)
# define clear_single_step(regs) (current->thread.debug.dbcr0 &= ~DBCR0_IC)
2018-03-26 17:55:21 +01:00
# define clear_br_trace(regs) do {} while(0)
2005-09-26 16:04:21 +10:00
# else
/* On non-4xx, the reason for the machine check or program
exception is in the MSR . */
# define get_reason(regs) ((regs)->msr)
2017-08-08 16:39:25 +10:00
# define REASON_TM SRR1_PROGTM
# define REASON_FP SRR1_PROGFPE
# define REASON_ILLEGAL SRR1_PROGILL
# define REASON_PRIVILEGED SRR1_PROGPRIV
# define REASON_TRAP SRR1_PROGTRAP
2020-05-06 13:40:48 +10:00
# define REASON_PREFIXED SRR1_PREFIXED
# define REASON_BOUNDARY SRR1_BOUNDARY
2005-09-26 16:04:21 +10:00
# define single_stepping(regs) ((regs)->msr & MSR_SE)
2021-06-18 01:51:03 +10:00
# define clear_single_step(regs) (regs_set_return_msr((regs), (regs)->msr & ~MSR_SE))
# define clear_br_trace(regs) (regs_set_return_msr((regs), (regs)->msr & ~MSR_BE))
2005-09-26 16:04:21 +10:00
# endif
2020-05-06 13:40:48 +10:00
# define inst_length(reason) (((reason) & REASON_PREFIXED) ? 8 : 4)
2017-08-08 16:39:21 +10:00
# if defined(CONFIG_E500)
2010-04-08 00:38:22 -05:00
int machine_check_e500mc ( struct pt_regs * regs )
{
unsigned long mcsr = mfspr ( SPRN_MCSR ) ;
2017-06-28 11:14:29 -05:00
unsigned long pvr = mfspr ( SPRN_PVR ) ;
2010-04-08 00:38:22 -05:00
unsigned long reason = mcsr ;
int recoverable = 1 ;
2011-06-16 14:09:17 -05:00
if ( reason & MCSR_LD ) {
2010-11-18 14:57:32 +08:00
recoverable = fsl_rio_mcheck_exception ( regs ) ;
if ( recoverable = = 1 )
goto silent_out ;
}
2010-04-08 00:38:22 -05:00
printk ( " Machine check in kernel mode. \n " ) ;
printk ( " Caused by (from MCSR=%lx): " , reason ) ;
if ( reason & MCSR_MCP )
2018-10-15 07:20:45 +00:00
pr_cont ( " Machine Check Signal \n " ) ;
2010-04-08 00:38:22 -05:00
if ( reason & MCSR_ICPERR ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Instruction Cache Parity Error \n " ) ;
2010-04-08 00:38:22 -05:00
/*
* This is recoverable by invalidating the i - cache .
*/
mtspr ( SPRN_L1CSR1 , mfspr ( SPRN_L1CSR1 ) | L1CSR1_ICFI ) ;
while ( mfspr ( SPRN_L1CSR1 ) & L1CSR1_ICFI )
;
/*
* This will generally be accompanied by an instruction
* fetch error report - - only treat MCSR_IF as fatal
* if it wasn ' t due to an L1 parity error .
*/
reason & = ~ MCSR_IF ;
}
if ( reason & MCSR_DCPERR_MC ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Data Cache Parity Error \n " ) ;
2011-08-27 06:14:23 -05:00
/*
* In write shadow mode we auto - recover from the error , but it
* may still get logged and cause a machine check . We should
* only treat the non - write shadow case as non - recoverable .
*/
2017-06-28 11:14:29 -05:00
/* On e6500 core, L1 DCWS (Data cache write shadow mode) bit
* is not implemented but L1 data cache always runs in write
* shadow mode . Hence on data cache parity errors HW will
* automatically invalidate the L1 Data Cache .
*/
if ( PVR_VER ( pvr ) ! = PVR_VER_E6500 ) {
if ( ! ( mfspr ( SPRN_L1CSR2 ) & L1CSR2_DCWS ) )
recoverable = 0 ;
}
2010-04-08 00:38:22 -05:00
}
if ( reason & MCSR_L2MMU_MHIT ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Hit on multiple TLB entries \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_NMI )
2018-10-15 07:20:45 +00:00
pr_cont ( " Non-maskable interrupt \n " ) ;
2010-04-08 00:38:22 -05:00
if ( reason & MCSR_IF ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Instruction Fetch Error Report \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_LD ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Load Error Report \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_ST ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Store Error Report \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_LDG ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Guarded Load Error Report \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_TLBSYNC )
2018-10-15 07:20:45 +00:00
pr_cont ( " Simultaneous tlbsync operations \n " ) ;
2010-04-08 00:38:22 -05:00
if ( reason & MCSR_BSL2_ERR ) {
2018-10-15 07:20:45 +00:00
pr_cont ( " Level 2 Cache Error \n " ) ;
2010-04-08 00:38:22 -05:00
recoverable = 0 ;
}
if ( reason & MCSR_MAV ) {
u64 addr ;
addr = mfspr ( SPRN_MCAR ) ;
addr | = ( u64 ) mfspr ( SPRN_MCARU ) < < 32 ;
2018-10-15 07:20:45 +00:00
pr_cont ( " Machine Check %s Address: %#llx \n " ,
2010-04-08 00:38:22 -05:00
reason & MCSR_MEA ? " Effective " : " Physical " , addr ) ;
}
2010-11-18 14:57:32 +08:00
silent_out :
2010-04-08 00:38:22 -05:00
mtspr ( SPRN_MCSR , mcsr ) ;
return mfspr ( SPRN_MCSR ) = = 0 & & recoverable ;
}
2007-12-21 15:39:21 +11:00
int machine_check_e500 ( struct pt_regs * regs )
{
2017-08-08 16:39:22 +10:00
unsigned long reason = mfspr ( SPRN_MCSR ) ;
2007-12-21 15:39:21 +11:00
2010-11-18 14:57:32 +08:00
if ( reason & MCSR_BUS_RBERR ) {
if ( fsl_rio_mcheck_exception ( regs ) )
return 1 ;
2013-04-28 13:20:08 +08:00
if ( fsl_pci_mcheck_exception ( regs ) )
return 1 ;
2010-11-18 14:57:32 +08:00
}
2005-09-26 16:04:21 +10:00
printk ( " Machine check in kernel mode. \n " ) ;
printk ( " Caused by (from MCSR=%lx): " , reason ) ;
if ( reason & MCSR_MCP )
2018-10-15 07:20:45 +00:00
pr_cont ( " Machine Check Signal \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_ICPERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Instruction Cache Parity Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_DCP_PERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Data Cache Push Parity Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_DCPERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Data Cache Parity Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_IAERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Instruction Address Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_RAERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Read Address Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_WAERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Write Address Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_IBERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Instruction Data Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_RBERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Read Data Bus Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_WBERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Write Data Bus Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_IPERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Instruction Parity Error \n " ) ;
2005-09-26 16:04:21 +10:00
if ( reason & MCSR_BUS_RPERR )
2018-10-15 07:20:45 +00:00
pr_cont ( " Bus - Read Parity Error \n " ) ;
2007-12-21 15:39:21 +11:00
return 0 ;
}
2010-10-08 08:32:11 -05:00
int machine_check_generic ( struct pt_regs * regs )
{
return 0 ;
}
2017-08-08 16:39:23 +10:00
# elif defined(CONFIG_PPC32)
2007-12-21 15:39:21 +11:00
int machine_check_generic ( struct pt_regs * regs )
{
2017-08-08 16:39:22 +10:00
unsigned long reason = regs - > msr ;
2007-12-21 15:39:21 +11:00
2005-09-26 16:04:21 +10:00
printk ( " Machine check in kernel mode. \n " ) ;
printk ( " Caused by (from SRR1=%lx): " , reason ) ;
switch ( reason & 0x601F0000 ) {
case 0x80000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " Machine check signal \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x40000 :
case 0x140000 : /* 7450 MSS error and TEA */
2018-10-15 07:20:45 +00:00
pr_cont ( " Transfer error ack signal \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x20000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " Data parity error signal \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x10000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " Address parity error signal \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x20000000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " L1 Data Cache error \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x40000000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " L1 Instruction Cache error \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
case 0x00100000 :
2018-10-15 07:20:45 +00:00
pr_cont ( " L2 data cache parity error \n " ) ;
2005-09-26 16:04:21 +10:00
break ;
default :
2018-10-15 07:20:45 +00:00
pr_cont ( " Unknown values in msr \n " ) ;
2005-09-26 16:04:21 +10:00
}
2007-09-21 05:11:20 +10:00
return 0 ;
}
2007-12-21 15:39:21 +11:00
# endif /* everything else */
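The 32-bit machine_check_generic() above decodes the cause by masking SRR1 with 0x601F0000 and switching on the result. A minimal stand-alone sketch of the same decode follows; `enum mc_cause` and `mc_cause_32()` are hypothetical names introduced for illustration, while the mask and case values are the ones used above.

```c
#include <stdint.h>

/* Hypothetical cause codes, one per message printed by the handler. */
enum mc_cause {
	MC_UNKNOWN,
	MC_SIGNAL,		/* Machine check signal */
	MC_TEA,			/* Transfer error ack signal */
	MC_DATA_PARITY,		/* Data parity error signal */
	MC_ADDR_PARITY,		/* Address parity error signal */
	MC_L1_DCACHE,		/* L1 Data Cache error */
	MC_L1_ICACHE,		/* L1 Instruction Cache error */
	MC_L2_PARITY,		/* L2 data cache parity error */
};

/* Classify an SRR1 value the same way the switch statement does. */
enum mc_cause mc_cause_32(uint32_t srr1)
{
	switch (srr1 & 0x601F0000u) {
	case 0x80000:
		return MC_SIGNAL;
	case 0x40000:
	case 0x140000:	/* 7450 MSS error and TEA */
		return MC_TEA;
	case 0x20000:
		return MC_DATA_PARITY;
	case 0x10000:
		return MC_ADDR_PARITY;
	case 0x20000000:
		return MC_L1_DCACHE;
	case 0x40000000:
		return MC_L1_ICACHE;
	case 0x00100000:
		return MC_L2_PARITY;
	default:
		return MC_UNKNOWN;
	}
}
```

Note that the switch matches the masked value exactly, so any combination of cause bits other than 0x140000 falls through to the unknown case.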
2007-09-21 05:11:20 +10:00
2021-01-30 23:08:33 +10:00
void die_mce(const char *str, struct pt_regs *regs, long err)
{
	/*
	 * The machine check wants to kill the interrupted context, but
	 * do_exit() checks for in_interrupt() and panics in that case, so
	 * exit the irq/nmi before calling die.
	 */
	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
		irq_exit();
	else
		nmi_exit();
	die(str, regs, err);
}
2021-01-30 23:08:49 +10:00
/*
* BOOK3S_64 does not call this handler as a non - maskable interrupt
* ( it uses its own early real - mode handler to handle the MCE proper
* and then raises irq_work to call this handler when interrupts are
* enabled ) .
*/
2021-01-30 23:08:38 +10:00
# ifdef CONFIG_PPC_BOOK3S_64
DEFINE_INTERRUPT_HANDLER_ASYNC ( machine_check_exception )
# else
DEFINE_INTERRUPT_HANDLER_NMI ( machine_check_exception )
# endif
2007-09-21 05:11:20 +10:00
{
int recover = 0 ;
2020-02-19 09:46:47 +01:00
2018-09-26 14:24:30 +02:00
__this_cpu_inc ( irq_stat . mce_exceptions ) ;
2010-01-31 20:34:06 +00:00
2017-04-18 22:08:17 +05:30
add_taint ( TAINT_MACHINE_CHECK , LOCKDEP_NOW_UNRELIABLE ) ;
2007-12-21 15:39:21 +11:00
	/* See if there are any machine-dependent calls. In theory, we would
	 * want to call the CPU first, and call the ppc_md. one if the CPU
	 * one returns a positive number. However there is existing code
	 * that assumes the board gets a first chance, so let's keep it
	 * that way for now and fix things later. --BenH.
	 */
2007-09-21 05:11:20 +10:00
if ( ppc_md . machine_check_exception )
recover = ppc_md . machine_check_exception ( regs ) ;
2007-12-21 15:39:21 +11:00
else if ( cur_cpu_spec - > machine_check )
recover = cur_cpu_spec - > machine_check ( regs ) ;
2007-09-21 05:11:20 +10:00
2007-12-21 15:39:21 +11:00
if ( recover > 0 )
2013-05-13 16:16:41 +00:00
goto bail ;
2007-09-21 05:11:20 +10:00
2011-01-11 19:45:31 +00:00
if ( debugger_fault_handler ( regs ) )
2013-05-13 16:16:41 +00:00
goto bail ;
2007-09-21 05:11:20 +10:00
if ( check_io_access ( regs ) )
2013-05-13 16:16:41 +00:00
goto bail ;
2007-09-21 05:11:20 +10:00
2021-01-30 23:08:33 +10:00
die_mce ( " Machine check " , regs , SIGBUS ) ;
2018-10-13 09:16:22 +00:00
2021-01-30 23:08:34 +10:00
bail :
2019-01-22 14:11:24 +00:00
/* Must die if the interrupt is not recoverable */
if ( ! ( regs - > msr & MSR_RI ) )
2021-01-30 23:08:33 +10:00
die_mce ( " Unrecoverable Machine check " , regs , SIGBUS ) ;
2018-10-13 09:16:22 +00:00
2021-01-30 23:08:38 +10:00
# ifdef CONFIG_PPC_BOOK3S_64
return ;
# else
return 0 ;
# endif
2005-09-26 16:04:21 +10:00
}
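The dispatch order in machine_check_exception() (board hook first, CPU hook only when no board hook is installed) is easy to misread as a fallback chain. The sketch below models it with plain function pointers; `struct regs_stub`, `mc_handler_t`, `mc_dispatch()` and the two sample handlers are hypothetical stand-ins, not kernel interfaces.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct pt_regs. */
struct regs_stub {
	unsigned long msr;
};

typedef int (*mc_handler_t)(struct regs_stub *regs);

/* Sample handlers: one reports recovery, one does not. */
int handler_recovers(struct regs_stub *regs)
{
	(void)regs;
	return 1;
}

int handler_fails(struct regs_stub *regs)
{
	(void)regs;
	return 0;
}

/*
 * Mirror of the dispatch order: if a board hook exists it is the only
 * one called; the CPU hook runs only when there is no board hook.
 * Returns 1 when the chosen hook reported recovery (a positive value).
 */
int mc_dispatch(mc_handler_t board, mc_handler_t cpu, struct regs_stub *regs)
{
	int recover = 0;

	if (board)
		recover = board(regs);
	else if (cpu)
		recover = cpu(regs);

	return recover > 0;
}
```

With this ordering a board hook that returns 0 is final: the CPU hook is not consulted afterwards, which is the behaviour the BenH comment says is kept for compatibility.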
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( SMIException ) /* async? */
2005-09-26 16:04:21 +10:00
{
die ( " System Management Interrupt " , regs , SIGABRT ) ;
}
2017-09-15 15:25:48 +10:00
# ifdef CONFIG_VSX
static void p9_hmi_special_emu ( struct pt_regs * regs )
{
unsigned int ra , rb , t , i , sel , instr , rc ;
const void __user * addr ;
2020-10-13 15:37:40 +11:00
u8 vbuf [ 16 ] __aligned ( 16 ) , * vdst ;
2017-09-15 15:25:48 +10:00
unsigned long ea , msr , msr_mask ;
bool swap ;
2021-03-10 17:46:43 +00:00
if ( __get_user ( instr , ( unsigned int __user * ) regs - > nip ) )
2017-09-15 15:25:48 +10:00
return ;
/*
* lxvb16x opcode : 0x7c0006d8
* lxvd2x opcode : 0x7c000698
* lxvh8x opcode : 0x7c000658
* lxvw4x opcode : 0x7c000618
*/
if ( ( instr & 0xfc00073e ) ! = 0x7c000618 ) {
pr_devel ( " HMI vec emu: not vector CI %i:%s[%d] nip=%016lx "
" instr=%08x \n " ,
smp_processor_id ( ) , current - > comm , current - > pid ,
regs - > nip , instr ) ;
return ;
}
/* Grab vector registers into the task struct */
msr = regs - > msr ; /* Grab msr before we flush the bits */
flush_vsx_to_thread ( current ) ;
enable_kernel_altivec ( ) ;
/*
* Is userspace running with a different endian ( this is rare but
* not impossible )
*/
swap = ( msr & MSR_LE ) ! = ( MSR_KERNEL & MSR_LE ) ;
/* Decode the instruction */
ra = ( instr > > 16 ) & 0x1f ;
rb = ( instr > > 11 ) & 0x1f ;
t = ( instr > > 21 ) & 0x1f ;
if ( instr & 1 )
vdst = ( u8 * ) & current - > thread . vr_state . vr [ t ] ;
else
vdst = ( u8 * ) & current - > thread . fp_state . fpr [ t ] [ 0 ] ;
/* Grab the vector address */
ea = regs - > gpr [ rb ] + ( ra ? regs - > gpr [ ra ] : 0 ) ;
if ( is_32bit_task ( ) )
ea & = 0xfffffffful ;
addr = ( __force const void __user * ) ea ;
/* Check it */
Remove 'type' argument from access_ok() function
Nobody has actually used the type (VERIFY_READ vs VERIFY_WRITE) argument
of the user address range verification function since we got rid of the
old racy i386-only code to walk page tables by hand.
It existed because the original 80386 would not honor the write protect
bit when in kernel mode, so you had to do COW by hand before doing any
user access. But we haven't supported that in a long time, and these
days the 'type' argument is a purely historical artifact.
A discussion about extending 'user_access_begin()' to do the range
checking resulted in this patch, because there is no way we're going to
move the old VERIFY_xyz interface to that model. And it's best done at
the end of the merge window when I've done most of my merges, so let's
just get this done once and for all.
This patch was mostly done with a sed-script, with manual fix-ups for
the cases that weren't of the trivial 'access_ok(VERIFY_xyz' form.
There were a couple of notable cases:
- csky still had the old "verify_area()" name as an alias.
- the iter_iov code had magical hardcoded knowledge of the actual
values of VERIFY_{READ,WRITE} (not that they mattered, since nothing
really used it)
- microblaze used the type argument for a debug printout
but other than those oddities this should be a total no-op patch.
I tried to fix up all architectures, did fairly extensive grepping for
access_ok() uses, and the changes are trivial, but I may have missed
something. Any missed conversion should be trivially fixable, though.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-01-03 18:57:57 -08:00
if ( ! access_ok ( addr , 16 ) ) {
2017-09-15 15:25:48 +10:00
pr_devel ( " HMI vec emu: bad access %i:%s[%d] nip=%016lx "
" instr=%08x addr=%016lx \n " ,
smp_processor_id ( ) , current - > comm , current - > pid ,
regs - > nip , instr , ( unsigned long ) addr ) ;
return ;
}
/* Read the vector */
rc = 0 ;
if ( ( unsigned long ) addr & 0xfUL )
/* unaligned case */
rc = __copy_from_user_inatomic ( vbuf , addr , 16 ) ;
else
__get_user_atomic_128_aligned ( vbuf , addr , rc ) ;
if ( rc ) {
pr_devel ( " HMI vec emu: page fault %i:%s[%d] nip=%016lx "
" instr=%08x addr=%016lx \n " ,
smp_processor_id ( ) , current - > comm , current - > pid ,
regs - > nip , instr , ( unsigned long ) addr ) ;
return ;
}
pr_devel ( " HMI vec emu: emulated vector CI %i:%s[%d] nip=%016lx "
" instr=%08x addr=%016lx \n " ,
smp_processor_id ( ) , current - > comm , current - > pid , regs - > nip ,
instr , ( unsigned long ) addr ) ;
/* Grab instruction "selector" */
sel = ( instr > > 6 ) & 3 ;
/*
* Check to make sure the facility is actually enabled . This
* could happen if we get a false positive hit .
*
* lxvd2x / lxvw4x always check MSR VSX sel = 0 , 2
* lxvh8x / lxvb16x check MSR VSX or VEC depending on VSR used sel = 1 , 3
*/
msr_mask = MSR_VSX ;
if ( ( sel & 1 ) & & ( instr & 1 ) ) /* lxvh8x & lxvb16x + VSR >= 32 */
msr_mask = MSR_VEC ;
if ( ! ( msr & msr_mask ) ) {
pr_devel ( " HMI vec emu: MSR fac clear %i:%s[%d] nip=%016lx "
" instr=%08x msr:%016lx \n " ,
smp_processor_id ( ) , current - > comm , current - > pid ,
regs - > nip , instr , msr ) ;
return ;
}
/* Do logging here before we modify sel based on endian */
switch ( sel ) {
case 0 : /* lxvw4x */
PPC_WARN_EMULATED ( lxvw4x , regs ) ;
break ;
case 1 : /* lxvh8x */
PPC_WARN_EMULATED ( lxvh8x , regs ) ;
break ;
case 2 : /* lxvd2x */
PPC_WARN_EMULATED ( lxvd2x , regs ) ;
break ;
case 3 : /* lxvb16x */
PPC_WARN_EMULATED ( lxvb16x , regs ) ;
break ;
}
# ifdef __LITTLE_ENDIAN__
/*
* An LE kernel stores the vector in the task struct as an LE
* byte array ( effectively swapping both the components and
* the content of the components ) . Those instructions expect
* the components to remain in ascending address order , so we
* swap them back .
*
* If we are running a BE user space , the expectation is that
* of a simple memcpy , so forcing the emulation to look like
* a lxvb16x should do the trick .
*/
if ( swap )
sel = 3 ;
switch ( sel ) {
case 0 : /* lxvw4x */
for ( i = 0 ; i < 4 ; i + + )
( ( u32 * ) vdst ) [ i ] = ( ( u32 * ) vbuf ) [ 3 - i ] ;
break ;
case 1 : /* lxvh8x */
for ( i = 0 ; i < 8 ; i + + )
( ( u16 * ) vdst ) [ i ] = ( ( u16 * ) vbuf ) [ 7 - i ] ;
break ;
case 2 : /* lxvd2x */
for ( i = 0 ; i < 2 ; i + + )
( ( u64 * ) vdst ) [ i ] = ( ( u64 * ) vbuf ) [ 1 - i ] ;
break ;
case 3 : /* lxvb16x */
for ( i = 0 ; i < 16 ; i + + )
vdst [ i ] = vbuf [ 15 - i ] ;
break ;
}
# else /* __LITTLE_ENDIAN__ */
/* On a big endian kernel, a BE userspace only needs a memcpy */
if ( ! swap )
sel = 3 ;
/* Otherwise, we need to swap the content of the components */
switch ( sel ) {
case 0 : /* lxvw4x */
for ( i = 0 ; i < 4 ; i + + )
( ( u32 * ) vdst ) [ i ] = cpu_to_le32 ( ( ( u32 * ) vbuf ) [ i ] ) ;
break ;
case 1 : /* lxvh8x */
for ( i = 0 ; i < 8 ; i + + )
( ( u16 * ) vdst ) [ i ] = cpu_to_le16 ( ( ( u16 * ) vbuf ) [ i ] ) ;
break ;
case 2 : /* lxvd2x */
for ( i = 0 ; i < 2 ; i + + )
( ( u64 * ) vdst ) [ i ] = cpu_to_le64 ( ( ( u64 * ) vbuf ) [ i ] ) ;
break ;
case 3 : /* lxvb16x */
memcpy ( vdst , vbuf , 16 ) ;
break ;
}
# endif /* !__LITTLE_ENDIAN__ */
/* Go to next instruction */
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ;
2017-09-15 15:25:48 +10:00
}
# endif /* CONFIG_VSX */
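The opcode match and field extraction at the top of p9_hmi_special_emu() can be checked in isolation. The following is a small user-space sketch using the same mask, base opcode and shifts as the function above; `struct vsx_load` and `decode_vsx_load()` are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct vsx_load {
	unsigned int ra;	/* base register (0 means "no base") */
	unsigned int rb;	/* index register */
	unsigned int t;		/* low five bits of the target register */
	unsigned int sel;	/* 0 lxvw4x, 1 lxvh8x, 2 lxvd2x, 3 lxvb16x */
	bool upper;		/* low instruction bit: target is VSR 32..63 */
};

/*
 * Return true and fill *out if instr is one of lxvw4x, lxvh8x, lxvd2x
 * or lxvb16x. The mask ignores the register fields, the two selector
 * bits and the low bit, so all four opcodes reduce to 0x7c000618.
 */
bool decode_vsx_load(uint32_t instr, struct vsx_load *out)
{
	if ((instr & 0xfc00073e) != 0x7c000618)
		return false;

	out->ra = (instr >> 16) & 0x1f;
	out->rb = (instr >> 11) & 0x1f;
	out->t = (instr >> 21) & 0x1f;
	out->sel = (instr >> 6) & 3;
	out->upper = (instr & 1) != 0;
	return true;
}
```

For example, the lxvd2x base opcode 0x7c000698 has bits 7:6 equal to binary 10, so `sel` comes out as 2, matching the `case 2: /* lxvd2x */` labels in the function.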
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER_ASYNC ( handle_hmi_exception )
2014-07-29 18:40:01 +05:30
{
struct pt_regs * old_regs ;
old_regs = set_irq_regs ( regs ) ;
2017-09-15 15:25:48 +10:00
# ifdef CONFIG_VSX
/* Real mode flagged P9 special emu is needed */
if ( local_paca - > hmi_p9_special_emu ) {
local_paca - > hmi_p9_special_emu = 0 ;
/*
* We don ' t want to take page faults while doing the
* emulation , we just replay the instruction if necessary .
*/
pagefault_disable ( ) ;
p9_hmi_special_emu ( regs ) ;
pagefault_enable ( ) ;
}
# endif /* CONFIG_VSX */
2014-07-29 18:40:01 +05:30
if ( ppc_md . handle_hmi_exception )
ppc_md . handle_hmi_exception ( regs ) ;
set_irq_regs ( old_regs ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( unknown_exception )
2005-09-26 16:04:21 +10:00
{
printk ( " Bad trap at PC: %lx, SR: %lx, vector=%lx \n " ,
regs - > nip , regs - > msr , regs - > trap ) ;
2018-04-17 17:10:34 -05:00
_exception ( SIGTRAP , regs , TRAP_UNK , 0 ) ;
2021-01-30 23:08:31 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER_ASYNC ( unknown_async_exception )
2021-01-30 23:08:31 +10:00
{
printk ( " Bad trap at PC: %lx, SR: %lx, vector=%lx \n " ,
regs - > nip , regs - > msr , regs - > trap ) ;
_exception ( SIGTRAP , regs , TRAP_UNK , 0 ) ;
2005-09-26 16:04:21 +10:00
}
2021-03-16 20:41:59 +10:00
DEFINE_INTERRUPT_HANDLER_NMI ( unknown_nmi_exception )
{
printk ( " Bad trap at PC: %lx, SR: %lx, vector=%lx \n " ,
regs - > nip , regs - > msr , regs - > trap ) ;
_exception ( SIGTRAP , regs , TRAP_UNK , 0 ) ;
return 0 ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( instruction_breakpoint_exception )
2005-09-26 16:04:21 +10:00
{
if ( notify_die ( DIE_IABR_MATCH , " iabr_match " , regs , 5 ,
5 , SIGTRAP ) = = NOTIFY_STOP )
2021-01-30 23:08:42 +10:00
return ;
2005-09-26 16:04:21 +10:00
if ( debugger_iabr_match ( regs ) )
2021-01-30 23:08:42 +10:00
return ;
2005-09-26 16:04:21 +10:00
_exception ( SIGTRAP , regs , TRAP_BRKPT , regs - > nip ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( RunModeException )
2005-09-26 16:04:21 +10:00
{
2018-04-17 17:10:34 -05:00
_exception ( SIGTRAP , regs , TRAP_UNK , 0 ) ;
2005-09-26 16:04:21 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( single_step_exception )
2005-09-26 16:04:21 +10:00
{
2010-06-15 11:35:31 +05:30
clear_single_step ( regs ) ;
2018-03-26 17:55:21 +01:00
clear_br_trace ( regs ) ;
2005-09-26 16:04:21 +10:00
2016-11-21 22:36:41 +05:30
if ( kprobe_post_handler ( regs ) )
return ;
2005-09-26 16:04:21 +10:00
if ( notify_die ( DIE_SSTEP , " single_step " , regs , 5 ,
5 , SIGTRAP ) = = NOTIFY_STOP )
2021-01-30 23:08:42 +10:00
return ;
2005-09-26 16:04:21 +10:00
if ( debugger_sstep ( regs ) )
2021-01-30 23:08:42 +10:00
return ;
2005-09-26 16:04:21 +10:00
_exception ( SIGTRAP , regs , TRAP_TRACE , regs - > nip ) ;
}
/*
* After we have successfully emulated an instruction , we have to
* check if the instruction was being single - stepped , and if so ,
* pretend we got a single - step exception . This was pointed out
* by Kumar Gala . - - paulus
*/
2005-10-06 13:27:05 +10:00
static void emulate_single_step ( struct pt_regs * regs )
2005-09-26 16:04:21 +10:00
{
2010-06-15 11:35:31 +05:30
if ( single_stepping ( regs ) )
single_step_exception ( regs ) ;
2005-09-26 16:04:21 +10:00
}
2007-02-07 01:47:59 -06:00
static inline int __parse_fpscr ( unsigned long fpscr )
2005-10-01 18:43:42 +10:00
{
2018-04-17 15:30:54 -05:00
int ret = FPE_FLTUNK ;
2005-10-01 18:43:42 +10:00
/* Invalid operation */
if ( ( fpscr & FPSCR_VE ) & & ( fpscr & FPSCR_VX ) )
2007-02-07 01:47:59 -06:00
ret = FPE_FLTINV ;
2005-10-01 18:43:42 +10:00
/* Overflow */
else if ( ( fpscr & FPSCR_OE ) & & ( fpscr & FPSCR_OX ) )
2007-02-07 01:47:59 -06:00
ret = FPE_FLTOVF ;
2005-10-01 18:43:42 +10:00
/* Underflow */
else if ( ( fpscr & FPSCR_UE ) & & ( fpscr & FPSCR_UX ) )
2007-02-07 01:47:59 -06:00
ret = FPE_FLTUND ;
2005-10-01 18:43:42 +10:00
/* Divide by zero */
else if ( ( fpscr & FPSCR_ZE ) & & ( fpscr & FPSCR_ZX ) )
2007-02-07 01:47:59 -06:00
ret = FPE_FLTDIV ;
2005-10-01 18:43:42 +10:00
/* Inexact result */
else if ( ( fpscr & FPSCR_XE ) & & ( fpscr & FPSCR_XX ) )
2007-02-07 01:47:59 -06:00
ret = FPE_FLTRES ;
return ret ;
}
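/*
 * Illustrative sketch, not part of this file: the if/else chain in
 * __parse_fpscr() reports a cause only when both the enable bit and the
 * matching status bit are set, and the first match wins. The same
 * priority order can be expressed as a table. The SK_* names and bit
 * values below are hypothetical stand-ins chosen for the example; they
 * are assumptions, not the kernel's FPSCR_* or FPE_* definitions.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enable bits (assumed values, for illustration only). */
#define SK_VE 0x00000080u
#define SK_OE 0x00000040u
#define SK_UE 0x00000020u
#define SK_ZE 0x00000010u
#define SK_XE 0x00000008u

/* Hypothetical status bits (assumed values, for illustration only). */
#define SK_VX 0x20000000u
#define SK_OX 0x10000000u
#define SK_UX 0x08000000u
#define SK_ZX 0x04000000u
#define SK_XX 0x02000000u

enum sk_cause {
	SK_UNKNOWN,
	SK_INVALID,
	SK_OVERFLOW,
	SK_UNDERFLOW,
	SK_DIVZERO,
	SK_INEXACT,
};

struct sk_rule {
	uint32_t enable;
	uint32_t status;
	enum sk_cause cause;
};

/* Order matters: earlier rows take priority, as in the if/else chain. */
static const struct sk_rule sk_rules[] = {
	{ SK_VE, SK_VX, SK_INVALID },
	{ SK_OE, SK_OX, SK_OVERFLOW },
	{ SK_UE, SK_UX, SK_UNDERFLOW },
	{ SK_ZE, SK_ZX, SK_DIVZERO },
	{ SK_XE, SK_XX, SK_INEXACT },
};

/* Return the first cause whose enable and status bits are both set. */
static enum sk_cause sk_classify(uint32_t fpscr)
{
	size_t i;

	for (i = 0; i < sizeof(sk_rules) / sizeof(sk_rules[0]); i++) {
		if ((fpscr & sk_rules[i].enable) && (fpscr & sk_rules[i].status))
			return sk_rules[i].cause;
	}
	return SK_UNKNOWN;
}
```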
static void parse_fpe ( struct pt_regs * regs )
{
int code = 0 ;
flush_fp_to_thread ( current ) ;
2020-08-18 17:19:17 +00:00
# ifdef CONFIG_PPC_FPU_REGS
2013-09-10 20:20:42 +10:00
code = __parse_fpscr ( current - > thread . fp_state . fpscr ) ;
2020-08-18 17:19:17 +00:00
# endif
2005-10-01 18:43:42 +10:00
_exception ( SIGFPE , regs , code , regs - > nip ) ;
}
/*
* Illegal instruction emulation support . Originally written to
2005-09-26 16:04:21 +10:00
* provide the PVR to user applications using the mfspr rd , PVR .
* Return non - zero if we can ' t emulate , or - EFAULT if the associated
* memory access caused an access fault . Return zero on success .
*
* There are a couple of ways to do this , either " decode " the instruction
* or directly match lots of bits . In this case , matching lots of
* bits is faster and easier .
2005-10-10 22:37:57 +10:00
*
2005-09-26 16:04:21 +10:00
*/
static int emulate_string_inst ( struct pt_regs * regs , u32 instword )
{
u8 rT = ( instword > > 21 ) & 0x1f ;
u8 rA = ( instword > > 16 ) & 0x1f ;
u8 NB_RB = ( instword > > 11 ) & 0x1f ;
u32 num_bytes ;
unsigned long EA ;
int pos = 0 ;
/* Early out if we are an invalid form of lswx */
2009-02-10 20:10:44 +00:00
if ( ( instword & PPC_INST_STRING_MASK ) = = PPC_INST_LSWX )
2005-09-26 16:04:21 +10:00
if ( ( rT = = rA ) | | ( rT = = NB_RB ) )
return - EINVAL ;
EA = ( rA = = 0 ) ? 0 : regs - > gpr [ rA ] ;
2009-02-10 20:10:44 +00:00
switch ( instword & PPC_INST_STRING_MASK ) {
case PPC_INST_LSWX :
case PPC_INST_STSWX :
2005-09-26 16:04:21 +10:00
EA + = NB_RB ;
num_bytes = regs - > xer & 0x7f ;
break ;
2009-02-10 20:10:44 +00:00
case PPC_INST_LSWI :
case PPC_INST_STSWI :
2005-09-26 16:04:21 +10:00
num_bytes = ( NB_RB = = 0 ) ? 32 : NB_RB ;
break ;
default :
return - EINVAL ;
}
while ( num_bytes ! = 0 )
{
u8 val ;
u32 shift = 8 * ( 3 - ( pos & 0x3 ) ) ;
2013-06-25 11:41:05 -05:00
/* if process is 32-bit, clear upper 32 bits of EA */
if ( ( regs - > msr & MSR_64BIT ) = = 0 )
EA & = 0xFFFFFFFF ;
2009-02-10 20:10:44 +00:00
switch ( ( instword & PPC_INST_STRING_MASK ) ) {
case PPC_INST_LSWX :
case PPC_INST_LSWI :
2005-09-26 16:04:21 +10:00
if ( get_user ( val , ( u8 __user * ) EA ) )
return - EFAULT ;
/* first time updating this reg,
* zero it out */
if ( pos = = 0 )
regs - > gpr [ rT ] = 0 ;
regs - > gpr [ rT ] | = val < < shift ;
break ;
2009-02-10 20:10:44 +00:00
case PPC_INST_STSWI :
case PPC_INST_STSWX :
2005-09-26 16:04:21 +10:00
val = regs - > gpr [ rT ] > > shift ;
if ( put_user ( val , ( u8 __user * ) EA ) )
return - EFAULT ;
break ;
}
/* move EA to next address */
EA + = 1 ;
num_bytes - - ;
/* manage our position within the register */
if ( + + pos = = 4 ) {
pos = 0 ;
if ( + + rT = = 32 )
rT = 0 ;
}
}
return 0 ;
}
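/*
 * Illustrative sketch, not part of this file: the load half of the loop
 * above packs bytes big-endian into the low 32 bits of successive
 * registers, zeroing each register before its first byte and wrapping
 * from r31 to r0. sk_load_string() is a hypothetical user-space helper
 * that mirrors only that packing; it reads from a plain buffer instead
 * of user memory and does no EA or fault handling.
 */

```c
#include <assert.h>
#include <stdint.h>

static void sk_load_string(uint64_t gpr[32], unsigned int rt,
			   const uint8_t *src, unsigned int num_bytes)
{
	unsigned int pos = 0;

	assert(rt < 32);

	while (num_bytes-- != 0) {
		/* Byte 0 lands in bits 31..24, byte 3 in bits 7..0. */
		unsigned int shift = 8 * (3 - pos);

		/* First byte for this register: clear it first. */
		if (pos == 0)
			gpr[rt] = 0;
		gpr[rt] |= (uint64_t)*src++ << shift;

		/* After four bytes move to the next register, wrapping. */
		if (++pos == 4) {
			pos = 0;
			if (++rt == 32)
				rt = 0;
		}
	}
}
```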
2006-08-30 13:11:38 -05:00
static int emulate_popcntb_inst ( struct pt_regs * regs , u32 instword )
{
u32 ra , rs ;
unsigned long tmp ;
ra = ( instword > > 16 ) & 0x1f ;
rs = ( instword > > 21 ) & 0x1f ;
tmp = regs - > gpr [ rs ] ;
tmp = tmp - ( ( tmp > > 1 ) & 0x5555555555555555ULL ) ;
tmp = ( tmp & 0x3333333333333333ULL ) + ( ( tmp > > 2 ) & 0x3333333333333333ULL ) ;
tmp = ( tmp + ( tmp > > 4 ) ) & 0x0f0f0f0f0f0f0f0fULL ;
regs - > gpr [ ra ] = tmp ;
return 0 ;
}
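/*
 * Illustrative sketch, not part of this file: the three masking steps in
 * emulate_popcntb_inst() count set bits in 2-bit, then 4-bit, then 8-bit
 * groups, so each byte of the result holds the population count of the
 * corresponding source byte. sk_popcntb() repeats those steps on a plain
 * 64-bit value and sk_popcntb_ref() is a hypothetical bit-by-bit
 * reference used only to cross-check it.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Per-byte population count using the same three steps as above. */
static uint64_t sk_popcntb(uint64_t x)
{
	x = x - ((x >> 1) & 0x5555555555555555ULL);
	x = (x & 0x3333333333333333ULL) + ((x >> 2) & 0x3333333333333333ULL);
	x = (x + (x >> 4)) & 0x0f0f0f0f0f0f0f0fULL;
	return x;
}

/* Slow reference: count the bits of each byte one at a time. */
static uint64_t sk_popcntb_ref(uint64_t x)
{
	uint64_t out = 0;
	int byte, bit;

	for (byte = 0; byte < 8; byte++) {
		uint64_t count = 0;

		for (bit = 0; bit < 8; bit++)
			count += (x >> (8 * byte + bit)) & 1;
		out |= count << (8 * byte);
	}
	return out;
}
```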
2007-11-19 21:35:29 -06:00
static int emulate_isel ( struct pt_regs * regs , u32 instword )
{
u8 rT = ( instword > > 21 ) & 0x1f ;
u8 rA = ( instword > > 16 ) & 0x1f ;
u8 rB = ( instword > > 11 ) & 0x1f ;
u8 BC = ( instword > > 6 ) & 0x1f ;
u8 bit ;
unsigned long tmp ;
tmp = ( rA = = 0 ) ? 0 : regs - > gpr [ rA ] ;
bit = ( regs - > ccr > > ( 31 - BC ) ) & 0x1 ;
regs - > gpr [ rT ] = bit ? tmp : regs - > gpr [ rB ] ;
return 0 ;
}
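/*
 * Illustrative sketch, not part of this file: emulate_isel() numbers
 * condition-register bits from the most significant end, so bit BC is
 * found by shifting right by (31 - BC). An rA field of 0 selects the
 * constant zero rather than r0. sk_cr_bit() and sk_isel() are
 * hypothetical helpers that mirror only that selection logic on plain
 * values.
 */

```c
#include <assert.h>
#include <stdint.h>

/* Extract CR bit 'bc', where bit 0 is the most significant bit. */
static unsigned int sk_cr_bit(uint32_t cr, unsigned int bc)
{
	assert(bc < 32);
	return (cr >> (31 - bc)) & 0x1;
}

/*
 * Return the value isel would write: the rA operand if the CR bit is
 * set (zero when the rA field itself is 0), otherwise the rB operand.
 */
static uint64_t sk_isel(uint32_t cr, unsigned int bc, unsigned int ra,
			uint64_t ra_val, uint64_t rb_val)
{
	uint64_t tmp = (ra == 0) ? 0 : ra_val;

	return sk_cr_bit(cr, bc) ? tmp : rb_val;
}
```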
2013-05-26 18:09:39 +00:00
# ifdef CONFIG_PPC_TRANSACTIONAL_MEM
static inline bool tm_abort_check ( struct pt_regs * regs , int cause )
{
/* If we're emulating a load/store in an active transaction, we cannot
* emulate it as the kernel operates in transaction suspended context .
* We need to abort the transaction . This creates a persistent TM
* abort so tell the user what caused it with a new code .
*/
if ( MSR_TM_TRANSACTIONAL ( regs - > msr ) ) {
tm_enable ( ) ;
tm_abort ( cause ) ;
return true ;
}
return false ;
}
# else
static inline bool tm_abort_check ( struct pt_regs * regs , int reason )
{
return false ;
}
# endif
2005-09-26 16:04:21 +10:00
static int emulate_instruction ( struct pt_regs * regs )
{
u32 instword ;
u32 rd ;
2013-08-07 02:01:47 +10:00
if ( ! user_mode ( regs ) )
2005-09-26 16:04:21 +10:00
return - EINVAL ;
if ( get_user ( instword , ( u32 __user * ) ( regs - > nip ) ) )
return - EFAULT ;
/* Emulate the mfspr rD, PVR. */
2009-02-10 20:10:44 +00:00
if ( ( instword & PPC_INST_MFSPR_PVR_MASK ) = = PPC_INST_MFSPR_PVR ) {
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( mfpvr , regs ) ;
2005-09-26 16:04:21 +10:00
rd = ( instword > > 21 ) & 0x1f ;
regs - > gpr [ rd ] = mfspr ( SPRN_PVR ) ;
return 0 ;
}
/* Emulating the dcba insn is just a no-op. */
2009-05-18 02:10:05 +00:00
if ( ( instword & PPC_INST_DCBA_MASK ) = = PPC_INST_DCBA ) {
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( dcba , regs ) ;
2005-09-26 16:04:21 +10:00
return 0 ;
2009-05-18 02:10:05 +00:00
}
2005-09-26 16:04:21 +10:00
/* Emulate the mcrxr insn. */
2009-02-10 20:10:44 +00:00
if ( ( instword & PPC_INST_MCRXR_MASK ) = = PPC_INST_MCRXR ) {
2005-10-10 22:37:57 +10:00
int shift = ( instword > > 21 ) & 0x1c ;
2005-09-26 16:04:21 +10:00
unsigned long msk = 0xf0000000UL > > shift ;
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( mcrxr , regs ) ;
2005-09-26 16:04:21 +10:00
regs - > ccr = ( regs - > ccr & ~ msk ) | ( ( regs - > xer > > shift ) & msk ) ;
regs - > xer & = ~ 0xf0000000UL ;
return 0 ;
}
/* Emulate load/store string insn. */
2009-05-18 02:10:05 +00:00
if ( ( instword & PPC_INST_STRING_GEN_MASK ) = = PPC_INST_STRING ) {
2013-05-26 18:09:39 +00:00
if ( tm_abort_check ( regs ,
TM_CAUSE_EMULATE | TM_CAUSE_PERSISTENT ) )
return - EINVAL ;
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( string , regs ) ;
2005-09-26 16:04:21 +10:00
return emulate_string_inst ( regs , instword ) ;
2009-05-18 02:10:05 +00:00
}
2005-09-26 16:04:21 +10:00
2006-08-30 13:11:38 -05:00
/* Emulate the popcntb (Population Count Bytes) instruction. */
2009-02-10 20:10:44 +00:00
if ( ( instword & PPC_INST_POPCNTB_MASK ) = = PPC_INST_POPCNTB ) {
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( popcntb , regs ) ;
2006-08-30 13:11:38 -05:00
return emulate_popcntb_inst ( regs , instword ) ;
}
2007-11-19 21:35:29 -06:00
/* Emulate isel (Integer Select) instruction */
2009-02-10 20:10:44 +00:00
if ( ( instword & PPC_INST_ISEL_MASK ) = = PPC_INST_ISEL ) {
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( isel , regs ) ;
2007-11-19 21:35:29 -06:00
return emulate_isel ( regs , instword ) ;
}
2013-07-03 16:26:47 -05:00
/* Emulate sync instruction variants */
if ( ( instword & PPC_INST_SYNC_MASK ) = = PPC_INST_SYNC ) {
PPC_WARN_EMULATED ( sync , regs ) ;
asm volatile ( " sync " ) ;
return 0 ;
}
2011-03-02 15:18:48 +00:00
# ifdef CONFIG_PPC64
/* Emulate the mfspr rD, DSCR. */
2013-05-01 20:06:33 +00:00
if ( ( ( ( instword & PPC_INST_MFSPR_DSCR_USER_MASK ) = =
PPC_INST_MFSPR_DSCR_USER ) | |
( ( instword & PPC_INST_MFSPR_DSCR_MASK ) = =
PPC_INST_MFSPR_DSCR ) ) & &
2011-03-02 15:18:48 +00:00
cpu_has_feature ( CPU_FTR_DSCR ) ) {
PPC_WARN_EMULATED ( mfdscr , regs ) ;
rd = ( instword > > 21 ) & 0x1f ;
regs - > gpr [ rd ] = mfspr ( SPRN_DSCR ) ;
return 0 ;
}
/* Emulate the mtspr DSCR, rD. */
2013-05-01 20:06:33 +00:00
if ( ( ( ( instword & PPC_INST_MTSPR_DSCR_USER_MASK ) = =
PPC_INST_MTSPR_DSCR_USER ) | |
( ( instword & PPC_INST_MTSPR_DSCR_MASK ) = =
PPC_INST_MTSPR_DSCR ) ) & &
2011-03-02 15:18:48 +00:00
cpu_has_feature ( CPU_FTR_DSCR ) ) {
PPC_WARN_EMULATED ( mtdscr , regs ) ;
rd = ( instword > > 21 ) & 0x1f ;
2012-09-03 16:48:46 +00:00
current - > thread . dscr = regs - > gpr [ rd ] ;
2011-03-02 15:18:48 +00:00
current - > thread . dscr_inherit = 1 ;
2012-09-03 16:48:46 +00:00
mtspr ( SPRN_DSCR , current - > thread . dscr ) ;
2011-03-02 15:18:48 +00:00
return 0 ;
}
# endif
2005-09-26 16:04:21 +10:00
return - EINVAL ;
}
2006-12-08 03:30:41 -08:00
int is_valid_bugaddr ( unsigned long addr )
2005-09-26 16:04:21 +10:00
{
2006-12-08 03:30:41 -08:00
return is_kernel_addr ( addr ) ;
2005-09-26 16:04:21 +10:00
}
2013-07-14 16:40:07 +08:00
# ifdef CONFIG_MATH_EMULATION
static int emulate_math ( struct pt_regs * regs )
{
int ret ;
ret = do_mathemu ( regs ) ;
if ( ret > = 0 )
PPC_WARN_EMULATED ( math , regs ) ;
switch ( ret ) {
case 0 :
emulate_single_step ( regs ) ;
return 0 ;
case 1 : {
int code = 0 ;
2013-09-10 20:20:42 +10:00
code = __parse_fpscr ( current - > thread . fp_state . fpscr ) ;
2013-07-14 16:40:07 +08:00
_exception ( SIGFPE , regs , code , regs - > nip ) ;
return 0 ;
}
case - EFAULT :
_exception ( SIGSEGV , regs , SEGV_MAPERR , regs - > nip ) ;
return 0 ;
}
return - 1 ;
}
# else
static inline int emulate_math ( struct pt_regs * regs ) { return - 1 ; }
# endif
2021-02-07 22:56:43 +10:00
static void do_program_check ( struct pt_regs * regs )
2005-09-26 16:04:21 +10:00
{
unsigned int reason = get_reason ( regs ) ;
2006-12-08 02:43:30 -06:00
/* We can now get here via a FP Unavailable exception if the core
2007-02-07 01:13:32 -06:00
* has no FPU , in that case the reason flags will be 0 */
2005-09-26 16:04:21 +10:00
2005-10-01 18:43:42 +10:00
if ( reason & REASON_FP ) {
/* IEEE FP exception */
parse_fpe ( regs ) ;
2021-02-07 22:56:43 +10:00
return ;
2005-10-06 13:27:05 +10:00
}
if ( reason & REASON_TRAP ) {
2016-02-18 13:48:01 +11:00
unsigned long bugaddr ;
2010-05-20 21:04:25 -05:00
/* Debugger is first in line to stop recursive faults in
* rcu_lock , notify_die , or atomic_notifier_call_chain */
if ( debugger_bpt ( regs ) )
2021-02-07 22:56:43 +10:00
return ;
2010-05-20 21:04:25 -05:00
2016-11-21 22:36:41 +05:30
if ( kprobe_handler ( regs ) )
2021-02-07 22:56:43 +10:00
return ;
2016-11-21 22:36:41 +05:30
2005-09-26 16:04:21 +10:00
/* trap exception */
2005-10-01 18:43:42 +10:00
if ( notify_die ( DIE_BPT , " breakpoint " , regs , 5 , 5 , SIGTRAP )
= = NOTIFY_STOP )
2021-02-07 22:56:43 +10:00
return ;
2006-12-08 03:30:41 -08:00
2016-02-18 13:48:01 +11:00
bugaddr = regs - > nip ;
/*
* Fixup bugaddr for BUG_ON ( ) in real mode
*/
if ( ! is_kernel_addr ( bugaddr ) & & ! ( regs - > msr & MSR_IR ) )
bugaddr + = PAGE_OFFSET ;
2006-12-08 03:30:41 -08:00
if ( ! ( regs - > msr & MSR_PR ) & & /* not user-mode */
2016-02-18 13:48:01 +11:00
report_bug ( bugaddr , regs ) = = BUG_TRAP_TYPE_WARN ) {
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ;
2021-02-07 22:56:43 +10:00
return ;
2005-09-26 16:04:21 +10:00
}
2005-10-06 13:27:05 +10:00
_exception ( SIGTRAP , regs , TRAP_BRKPT , regs - > nip ) ;
2021-02-07 22:56:43 +10:00
return ;
2005-10-06 13:27:05 +10:00
}
2013-02-13 16:21:40 +00:00
# ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if ( reason & REASON_TM ) {
/* This is a TM "Bad Thing Exception" program check.
* This occurs when :
* - An rfid / hrfid / mtmsrd attempts to cause an illegal
* transition in TM states .
* - A trechkpt is attempted when transactional .
* - A treclaim is attempted when non transactional .
* - A tend is illegally attempted .
* - writing a TM SPR when transactional .
2017-10-12 15:45:25 +11:00
*
* If usermode caused this , it ' s done something illegal and
2013-02-13 16:21:40 +00:00
* gets a SIGILL slap on the wrist . We call it an illegal
* operand to distinguish from the instruction just being bad
* ( e . g . executing a ' tend ' on a CPU without TM ! ) ; it ' s an
* illegal / placement / of a valid instruction .
*/
if ( user_mode ( regs ) ) {
_exception ( SIGILL , regs , ILL_ILLOPN , regs - > nip ) ;
2021-02-07 22:56:43 +10:00
return ;
2013-02-13 16:21:40 +00:00
} else {
printk ( KERN_EMERG " Unexpected TM Bad Thing exception "
2018-11-26 18:11:59 -02:00
" at %lx (msr 0x%lx) tm_scratch=%llx \n " ,
regs - > nip , regs - > msr , get_paca ( ) - > tm_scratch ) ;
2013-02-13 16:21:40 +00:00
die ( " Unrecoverable exception " , regs , SIGABRT ) ;
}
}
# endif
2005-10-06 13:27:05 +10:00
2013-08-15 15:22:19 +10:00
/*
* If we took the program check in the kernel skip down to sending a
* SIGILL . The subsequent cases all relate to emulating instructions
* which we should only do for userspace . We also do not want to enable
* interrupts for kernel faults because that might lead to further
* faults , and lose the context of the original exception .
*/
if ( ! user_mode ( regs ) )
goto sigill ;
2021-01-30 23:08:39 +10:00
interrupt_cond_local_irq_enable ( regs ) ;
2006-03-03 17:11:40 +11:00
2007-02-07 01:13:32 -06:00
/* (reason & REASON_ILLEGAL) would be the obvious thing here,
* but there seems to be a hardware bug on the 405 GP ( RevD )
* that means ESR is sometimes set incorrectly - either to
* ESR_DST ( ! ? ) or 0. In the process of chasing this with the
* hardware people - not sure if it can happen on any illegal
* instruction or only on FP instructions , whether there is a
2013-06-09 17:01:24 +10:00
* pattern to occurrences etc . - dgibson 31 / Mar / 2003
*/
2013-07-14 16:40:07 +08:00
if ( ! emulate_math ( regs ) )
2021-02-07 22:56:43 +10:00
return ;
2007-02-07 01:13:32 -06:00
2005-10-06 13:27:05 +10:00
/* Try to emulate it if we should. */
if ( reason & ( REASON_ILLEGAL | REASON_PRIVILEGED ) ) {
2005-09-26 16:04:21 +10:00
switch ( emulate_instruction ( regs ) ) {
case 0 :
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ;
2005-09-26 16:04:21 +10:00
emulate_single_step ( regs ) ;
2021-02-07 22:56:43 +10:00
return ;
2005-09-26 16:04:21 +10:00
case - EFAULT :
_exception ( SIGSEGV , regs , SEGV_MAPERR , regs - > nip ) ;
2021-02-07 22:56:43 +10:00
return ;
2005-09-26 16:04:21 +10:00
}
}
2005-10-06 13:27:05 +10:00
2013-08-15 15:22:19 +10:00
sigill :
2005-10-06 13:27:05 +10:00
if ( reason & REASON_PRIVILEGED )
_exception ( SIGILL , regs , ILL_PRVOPC , regs - > nip ) ;
else
_exception ( SIGILL , regs , ILL_ILLOPC , regs - > nip ) ;
2013-05-13 16:16:41 +00:00
2021-02-07 22:56:43 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( program_check_exception )
2021-02-07 22:56:43 +10:00
{
do_program_check ( regs ) ;
2005-09-26 16:04:21 +10:00
}
2013-06-14 20:07:41 +10:00
/*
* This occurs when running in hypervisor mode on POWER6 or later
* and an illegal instruction is encountered .
*/
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( emulation_assist_interrupt )
2013-06-14 20:07:41 +10:00
{
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr | REASON_ILLEGAL ) ;
2021-02-07 22:56:43 +10:00
do_program_check ( regs ) ;
2013-06-14 20:07:41 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( alignment_exception )
2005-09-26 16:04:21 +10:00
{
2006-11-01 15:11:39 +11:00
int sig , code , fixed = 0 ;
2020-05-06 13:40:48 +10:00
unsigned long reason ;
2005-09-26 16:04:21 +10:00
2021-01-30 23:08:39 +10:00
interrupt_cond_local_irq_enable ( regs ) ;
2012-05-08 13:38:50 +10:00
2020-05-06 13:40:48 +10:00
reason = get_reason ( regs ) ;
if ( reason & REASON_BOUNDARY ) {
sig = SIGBUS ;
code = BUS_ADRALN ;
goto bad ;
}
2013-05-26 18:09:39 +00:00
if ( tm_abort_check ( regs , TM_CAUSE_ALIGNMENT | TM_CAUSE_PERSISTENT ) )
2021-01-30 23:08:42 +10:00
return ;
2013-05-26 18:09:39 +00:00
2006-06-07 16:15:39 +10:00
/* we don't implement logging of alignment exceptions */
if ( ! ( current - > thread . align_ctl & PR_UNALIGN_SIGBUS ) )
fixed = fix_alignment ( regs ) ;
2005-09-26 16:04:21 +10:00
if ( fixed = = 1 ) {
2020-05-06 13:40:48 +10:00
/* skip over emulated instruction */
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , inst_length ( reason ) ) ;
2005-09-26 16:04:21 +10:00
emulate_single_step ( regs ) ;
2021-01-30 23:08:42 +10:00
return ;
2005-09-26 16:04:21 +10:00
}
2005-10-01 18:43:42 +10:00
/* Operand address was bad */
2005-09-26 16:04:21 +10:00
if ( fixed = = - EFAULT ) {
2006-11-01 15:11:39 +11:00
sig = SIGSEGV ;
code = SEGV_ACCERR ;
} else {
sig = SIGBUS ;
code = BUS_ADRALN ;
2005-09-26 16:04:21 +10:00
}
2020-05-06 13:40:48 +10:00
bad :
2006-11-01 15:11:39 +11:00
if ( user_mode ( regs ) )
_exception ( sig , regs , code , regs - > dar ) ;
else
2021-01-30 23:08:21 +10:00
bad_page_fault ( regs , sig ) ;
2005-09-26 16:04:21 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( stack_overflow_exception )
powerpc/32: Add early stack overflow detection with VMAP stack.
To avoid recursive faults, stack overflow detection has to be
performed before writing in the stack in exception prologs.
Do it by checking the alignment. If the stack pointer alignment is
wrong, it means it is pointing to the following or preceding page.
Without VMAP stack, a stack overflow is catastrophic. With VMAP
stack, a stack overflow isn't destructive, so don't panic. Kill
the task with SIGSEGV instead.
A dedicated overflow stack is set up for each CPU.
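One way to picture the alignment check, as a minimal sketch under stated
assumptions rather than the code this patch adds: suppose each stack is
2^SHIFT bytes long and is placed on a 2^(SHIFT+1) boundary. Then bit SHIFT
of any in-range stack pointer is clear, and it becomes set as soon as the
pointer moves into the neighbouring page range on either side. The name
sk_sp_overflowed, the SK_THREAD_SHIFT value and the sample addresses are
all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stack size of 8K (2^13); the real value is configuration dependent. */
#define SK_THREAD_SHIFT 13u

/*
 * Assuming stacks are aligned to twice their size, a stack pointer is
 * out of range exactly when bit SK_THREAD_SHIFT is set. Returns 1 for
 * an overflowed pointer and 0 otherwise.
 */
static int sk_sp_overflowed(uint32_t sp)
{
	return (sp >> SK_THREAD_SHIFT) & 1u;
}
```

With these assumptions a stack based at 0xc900c000 covers 0xc900c000 to
0xc900dfff, so a pointer just below the base or just past the top is
flagged while any pointer inside that range is not.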
lkdtm: Performing direct entry EXHAUST_STACK
lkdtm: Calling function with 512 frame size to depth 32 ...
lkdtm: loop 32/32 ...
lkdtm: loop 31/32 ...
lkdtm: loop 30/32 ...
lkdtm: loop 29/32 ...
lkdtm: loop 28/32 ...
lkdtm: loop 27/32 ...
lkdtm: loop 26/32 ...
lkdtm: loop 25/32 ...
lkdtm: loop 24/32 ...
lkdtm: loop 23/32 ...
lkdtm: loop 22/32 ...
lkdtm: loop 21/32 ...
lkdtm: loop 20/32 ...
Kernel stack overflow in process test[359], r1=c900c008
Oops: Kernel stack overflow, sig: 6 [#1]
BE PAGE_SIZE=4K MMU=Hash PowerMac
Modules linked in:
CPU: 0 PID: 359 Comm: test Not tainted 5.3.0-rc7+ #2225
NIP: c0622060 LR: c0626710 CTR: 00000000
REGS: c0895f48 TRAP: 0000 Not tainted (5.3.0-rc7+)
MSR: 00001032 <ME,IR,DR,RI> CR: 28004224 XER: 00000000
GPR00: c0626ca4 c900c008 c783c000 c07335cc c900c010 c07335cc c900c0f0 c07335cc
GPR08: c900c0f0 00000001 00000000 00000000 28008222 00000000 00000000 00000000
GPR16: 00000000 00000000 10010128 10010000 b799c245 10010158 c07335cc 00000025
GPR24: c0690000 c08b91d4 c068f688 00000020 c900c0f0 c068f668 c08b95b4 c08b91d4
NIP [c0622060] format_decode+0x0/0x4d4
LR [c0626710] vsnprintf+0x80/0x5fc
Call Trace:
[c900c068] [c0626ca4] vscnprintf+0x18/0x48
[c900c078] [c007b944] vprintk_store+0x40/0x214
[c900c0b8] [c007bf50] vprintk_emit+0x90/0x1dc
[c900c0e8] [c007c5cc] printk+0x50/0x60
[c900c128] [c03da5b0] recursive_loop+0x44/0x6c
[c900c338] [c03da5c4] recursive_loop+0x58/0x6c
[c900c548] [c03da5c4] recursive_loop+0x58/0x6c
[c900c758] [c03da5c4] recursive_loop+0x58/0x6c
[c900c968] [c03da5c4] recursive_loop+0x58/0x6c
[c900cb78] [c03da5c4] recursive_loop+0x58/0x6c
[c900cd88] [c03da5c4] recursive_loop+0x58/0x6c
[c900cf98] [c03da5c4] recursive_loop+0x58/0x6c
[c900d1a8] [c03da5c4] recursive_loop+0x58/0x6c
[c900d3b8] [c03da5c4] recursive_loop+0x58/0x6c
[c900d5c8] [c03da5c4] recursive_loop+0x58/0x6c
[c900d7d8] [c03da5c4] recursive_loop+0x58/0x6c
[c900d9e8] [c03da5c4] recursive_loop+0x58/0x6c
[c900dbf8] [c03da5c4] recursive_loop+0x58/0x6c
[c900de08] [c03da67c] lkdtm_EXHAUST_STACK+0x30/0x4c
[c900de18] [c03da3e8] direct_entry+0xc8/0x140
[c900de48] [c029fb40] full_proxy_write+0x64/0xcc
[c900de68] [c01500f8] __vfs_write+0x30/0x1d0
[c900dee8] [c0152cb8] vfs_write+0xb8/0x1d4
[c900df08] [c0152f7c] ksys_write+0x58/0xe8
[c900df38] [c0014208] ret_from_syscall+0x0/0x34
--- interrupt: c01 at 0xf806664
LR = 0x1000c868
Instruction dump:
4bffff91 80010014 7c832378 7c0803a6 38210010 4e800020 3d20c08a 3ca0c089
8089a0cc 38a58f0c 38600001 4ba2d494 <9421ffe0> 7c0802a6 bfc10018 7c9f2378
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1b89c121b4070c7ee99e4f22cc178f15a736b07b.1576916812.git.christophe.leroy@c-s.fr
2019-12-21 08:32:29 +00:00
{
die ( " Kernel stack overflow " , regs , SIGSEGV ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( kernel_fp_unavailable_exception )
2005-10-01 18:43:42 +10:00
{
printk ( KERN_EMERG " Unrecoverable FP Unavailable Exception "
" %lx at %lx \n " , regs - > trap , regs - > nip ) ;
die ( " Unrecoverable FP Unavailable Exception " , regs , SIGABRT ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( altivec_unavailable_exception )
2005-10-01 18:43:42 +10:00
{
if ( user_mode ( regs ) ) {
/* A user program has executed an altivec instruction,
but this kernel doesn ' t support altivec . */
_exception ( SIGILL , regs , ILL_ILLOPC , regs - > nip ) ;
2021-01-30 23:08:42 +10:00
return ;
2005-10-01 18:43:42 +10:00
}
2006-10-13 11:41:00 +10:00
2005-10-01 18:43:42 +10:00
printk ( KERN_EMERG " Unrecoverable VMX/Altivec Unavailable Exception "
" %lx at %lx \n " , regs - > trap , regs - > nip ) ;
die ( " Unrecoverable VMX/Altivec Unavailable Exception " , regs , SIGABRT ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( vsx_unavailable_exception )
2008-06-25 14:07:18 +10:00
{
if ( user_mode ( regs ) ) {
/* A user program has executed a VSX instruction,
but this kernel doesn ' t support vsx . */
_exception ( SIGILL , regs , ILL_ILLOPC , regs - > nip ) ;
return ;
}
printk ( KERN_EMERG " Unrecoverable VSX Unavailable Exception "
" %lx at %lx \n " , regs - > trap , regs - > nip ) ;
die ( " Unrecoverable VSX Unavailable Exception " , regs , SIGABRT ) ;
}
2013-08-09 17:29:29 +10:00
# ifdef CONFIG_PPC64
2016-09-14 18:02:15 +10:00
static void tm_unavailable ( struct pt_regs * regs )
{
2016-09-14 18:02:16 +10:00
# ifdef CONFIG_PPC_TRANSACTIONAL_MEM
if ( user_mode ( regs ) ) {
current - > thread . load_tm + + ;
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr | MSR_TM ) ;
2016-09-14 18:02:16 +10:00
tm_enable ( ) ;
tm_restore_sprs ( & current - > thread ) ;
return ;
}
# endif
2016-09-14 18:02:15 +10:00
pr_emerg ( " Unrecoverable TM Unavailable Exception "
" %lx at %lx \n " , regs - > trap , regs - > nip ) ;
die ( " Unrecoverable TM Unavailable Exception " , regs , SIGABRT ) ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( facility_unavailable_exception )
2013-02-13 16:21:38 +00:00
{
2013-06-25 17:47:56 +10:00
static char * facility_strings [ ] = {
2013-08-09 17:29:29 +10:00
[ FSCR_FP_LG ] = " FPU " ,
[ FSCR_VECVSX_LG ] = " VMX/VSX " ,
[ FSCR_DSCR_LG ] = " DSCR " ,
[ FSCR_PM_LG ] = " PMU SPRs " ,
[ FSCR_BHRB_LG ] = " BHRB " ,
[ FSCR_TM_LG ] = " TM " ,
[ FSCR_EBB_LG ] = " EBB " ,
[ FSCR_TAR_LG ] = " TAR " ,
2017-04-07 11:27:43 +10:00
[ FSCR_MSGP_LG ] = " MSGP " ,
2017-04-07 11:27:44 +10:00
[ FSCR_SCV_LG ] = " SCV " ,
2020-05-06 13:40:41 +10:00
[ FSCR_PREFIX_LG ] = " PREFIX " ,
2013-06-25 17:47:56 +10:00
} ;
2013-08-09 17:29:29 +10:00
char * facility = " unknown " ;
2013-06-25 17:47:56 +10:00
u64 value ;
2015-05-21 12:13:01 +05:30
u32 instword , rd ;
2013-08-09 17:29:29 +10:00
u8 status ;
bool hv ;
2013-06-25 17:47:56 +10:00
2021-04-14 19:00:33 +08:00
hv = ( TRAP ( regs ) = = INTERRUPT_H_FAC_UNAVAIL ) ;
2013-08-09 17:29:29 +10:00
if ( hv )
2013-06-25 17:47:57 +10:00
value = mfspr ( SPRN_HFSCR ) ;
2013-08-09 17:29:29 +10:00
else
value = mfspr ( SPRN_FSCR ) ;
status = value > > 56 ;
2018-03-29 11:53:37 +05:30
if ( ( hv | | status > = 2 ) & &
( status < ARRAY_SIZE ( facility_strings ) ) & &
facility_strings [ status ] )
facility = facility_strings [ status ] ;
/* We should not have taken this interrupt in kernel */
if ( ! user_mode ( regs ) ) {
pr_emerg ( " Facility '%s' unavailable (%d) exception in kernel mode at %lx \n " ,
facility , status , regs - > nip ) ;
die ( " Unexpected facility unavailable exception " , regs , SIGABRT ) ;
}
2021-01-30 23:08:39 +10:00
interrupt_cond_local_irq_enable ( regs ) ;
2018-03-29 11:53:37 +05:30
2013-08-09 17:29:29 +10:00
if ( status = = FSCR_DSCR_LG ) {
2015-05-21 12:13:01 +05:30
/*
* User is accessing the DSCR register using the problem
* state only SPR number ( 0x03 ) either through a mfspr or
* a mtspr instruction . If it is a write attempt through
* a mtspr , then we set the inherit bit . This also allows
* the user to write or read the register directly in the
* future by setting via the FSCR DSCR bit . But in case it
* is a read DSCR attempt through a mfspr instruction , we
* just emulate the instruction instead . This code path will
* always emulate all the mfspr instructions till the user
2016-02-24 10:51:11 -08:00
* has attempted at least one mtspr instruction . This way it
2015-05-21 12:13:01 +05:30
* preserves the same behaviour when the user is accessing
* the DSCR through privilege level only SPR number ( 0x11 )
* which is emulated through illegal instruction exception .
* We always leave HFSCR DSCR set .
2013-08-09 17:29:29 +10:00
*/
2015-05-21 12:13:01 +05:30
if ( get_user ( instword , ( u32 __user * ) ( regs - > nip ) ) ) {
pr_err ( " Failed to fetch the user instruction \n " ) ;
return ;
}
/* Write into DSCR (mtspr 0x03, RS) */
if ( ( instword & PPC_INST_MTSPR_DSCR_USER_MASK )
= = PPC_INST_MTSPR_DSCR_USER ) {
rd = ( instword > > 21 ) & 0x1f ;
current - > thread . dscr = regs - > gpr [ rd ] ;
current - > thread . dscr_inherit = 1 ;
2016-06-09 12:31:08 +10:00
current - > thread . fscr | = FSCR_DSCR ;
mtspr ( SPRN_FSCR , current - > thread . fscr ) ;
2015-05-21 12:13:01 +05:30
}
/* Read from DSCR (mfspr RT, 0x03) */
if ( ( instword & PPC_INST_MFSPR_DSCR_USER_MASK )
= = PPC_INST_MFSPR_DSCR_USER ) {
if ( emulate_instruction ( regs ) ) {
pr_err ( " DSCR based mfspr emulation failed \n " ) ;
return ;
}
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ;
2015-05-21 12:13:01 +05:30
emulate_single_step ( regs ) ;
}
2013-08-09 17:29:29 +10:00
return ;
2013-06-25 17:47:57 +10:00
}
2016-09-14 18:02:15 +10:00
if ( status = = FSCR_TM_LG ) {
/*
* If we ' re here then the hardware is TM aware because it
* generated an exception with FSCR_TM set .
*
* If cpu_has_feature ( CPU_FTR_TM ) is false , then either firmware
* told us not to do TM , or the kernel is not built with TM
* support .
*
* If both of those things are true , then userspace can spam the
* console by triggering the printk ( ) below just by continually
* doing tbegin ( or any TM instruction ) . So in that case just
* send the process a SIGILL immediately .
*/
if ( ! cpu_has_feature ( CPU_FTR_TM ) )
goto out ;
tm_unavailable ( regs ) ;
return ;
}
2016-11-30 17:45:09 +11:00
pr_err_ratelimited ( " %sFacility '%s' unavailable (%d), exception at 0x%lx, MSR=%lx \n " ,
hv ? " Hypervisor " : " " , facility , status , regs - > nip , regs - > msr ) ;
2013-02-13 16:21:38 +00:00
2016-09-14 18:02:15 +10:00
out :
2018-03-29 11:53:37 +05:30
_exception ( SIGILL , regs , ILL_ILLOPC , regs - > nip ) ;
2013-02-13 16:21:38 +00:00
}
2013-08-09 17:29:29 +10:00
# endif
2013-02-13 16:21:38 +00:00
2013-02-13 16:21:39 +00:00
# ifdef CONFIG_PPC_TRANSACTIONAL_MEM
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( fp_unavailable_tm )
2013-02-13 16:21:39 +00:00
{
/* Note: This does not handle any kind of FP laziness. */
TM_DEBUG ( " FP Unavailable trap whilst transactional at 0x%lx, MSR=%lx \n " ,
regs - > nip , regs - > msr ) ;
/* We can only have got here if the task started using FP after
* beginning the transaction . So , the transactional regs are just a
* copy of the checkpointed ones . But , we still need to recheckpoint
* as we ' re enabling FP for the process ; it will return , abort the
* transaction , and probably retry but now with FP enabled . So the
* checkpointed FP registers need to be loaded .
*/
powerpc: Don't corrupt transactional state when using FP/VMX in kernel
Currently, when we have a process using the transactional memory
facilities on POWER8 (that is, the processor is in transactional
or suspended state), and the process enters the kernel and the
kernel then uses the floating-point or vector (VMX/Altivec) facility,
we end up corrupting the user-visible FP/VMX/VSX state. This
happens, for example, if a page fault causes a copy-on-write
operation, because the copy_page function will use VMX to do the
copy on POWER8. The test program below demonstrates the bug.
The bug happens because when FP/VMX state for a transactional process
is stored in the thread_struct, we store the checkpointed state in
.fp_state/.vr_state and the transactional (current) state in
.transact_fp/.transact_vr. However, when the kernel wants to use
FP/VMX, it calls enable_kernel_fp() or enable_kernel_altivec(),
which saves the current state in .fp_state/.vr_state. Furthermore,
when we return to the user process we return with FP/VMX/VSX
disabled. The next time the process uses FP/VMX/VSX, we don't know
which set of state (the current register values, .fp_state/.vr_state,
or .transact_fp/.transact_vr) we should be using, since we have no
way to tell if we are still in the same transaction, and if not,
whether the previous transaction succeeded or failed.
Thus it is necessary to strictly adhere to the rule that if FP has
been enabled at any point in a transaction, we must keep FP enabled
for the user process with the current transactional state in the
FP registers, until we detect that it is no longer in a transaction.
Similarly for VMX; once enabled it must stay enabled until the
process is no longer transactional.
In order to keep this rule, we add a new thread_info flag which we
test when returning from the kernel to userspace, called TIF_RESTORE_TM.
This flag indicates that there is FP/VMX/VSX state to be restored
before entering userspace, and when it is set the .tm_orig_msr field
in the thread_struct indicates what state needs to be restored.
The restoration is done by restore_tm_state(). The TIF_RESTORE_TM
bit is set by new giveup_fpu/altivec_maybe_transactional helpers,
which are called from enable_kernel_fp/altivec, giveup_vsx, and
flush_fp/altivec_to_thread instead of giveup_fpu/altivec.
The other thing to be done is to get the transactional FP/VMX/VSX
state from .fp_state/.vr_state when doing reclaim, if that state
has been saved there by giveup_fpu/altivec_maybe_transactional.
Having done this, we set the FP/VMX bit in the thread's MSR after
reclaim to indicate that that part of the state is now valid
(having been reclaimed from the processor's checkpointed state).
Finally, in the signal handling code, we move the clearing of the
transactional state bits in the thread's MSR a bit earlier, before
calling flush_fp_to_thread(), so that we don't unnecessarily set
the TIF_RESTORE_TM bit.
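The TIF_RESTORE_TM rule described above can be sketched as a small
user-space model. This is only an illustration under stated assumptions:
struct model_thread, the MODEL_MSR_* bits and both model_* functions are
hypothetical stand-ins, not the kernel's thread_struct, MSR layout or
helper implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins: the real MSR bits and thread_struct differ. */
#define MODEL_MSR_FP  0x1u
#define MODEL_MSR_VEC 0x2u

struct model_thread {
	unsigned int msr;         /* facilities currently enabled for userspace */
	unsigned int tm_orig_msr; /* MSR as it was when the kernel took a facility */
	bool in_transaction;      /* transactional or suspended */
	bool tif_restore_tm;      /* models the TIF_RESTORE_TM flag */
};

/* Models giveup_fpu_maybe_transactional(): the kernel takes over FP. */
static void model_giveup_fpu_maybe_transactional(struct model_thread *t)
{
	assert(t != NULL);
	if (t->in_transaction && !t->tif_restore_tm) {
		t->tm_orig_msr = t->msr;  /* remember what userspace had enabled */
		t->tif_restore_tm = true;
	}
	t->msr &= ~MODEL_MSR_FP;
}

/* Models restore_tm_state(): runs on the way back to userspace. */
static void model_restore_tm_state(struct model_thread *t)
{
	assert(t != NULL);
	if (!t->tif_restore_tm)
		return;
	t->tif_restore_tm = false;
	/* Re-enable only the facilities that were on inside the transaction. */
	t->msr |= t->tm_orig_msr & (MODEL_MSR_FP | MODEL_MSR_VEC);
}
```

In this model a transactional thread that had FP enabled gets it back on
return to userspace, while a non-transactional thread returns with FP
disabled, as before the patch.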
This is the test program:
/* Michael Neuling 4/12/2013
*
* See if the altivec state is leaked out of an aborted transaction due to
* kernel vmx copy loops.
*
* gcc -m64 htm_vmxcopy.c -o htm_vmxcopy
*
*/
/* We don't use all of these, but for reference: */
int main(int argc, char *argv[])
{
long double vecin = 1.3;
long double vecout;
unsigned long pgsize = getpagesize();
int i;
int fd;
int size = pgsize*16;
char tmpfile[] = "/tmp/page_faultXXXXXX";
char buf[pgsize];
char *a;
uint64_t aborted = 0;
fd = mkstemp(tmpfile);
assert(fd >= 0);
memset(buf, 0, pgsize);
for (i = 0; i < size; i += pgsize)
assert(write(fd, buf, pgsize) == pgsize);
unlink(tmpfile);
a = mmap(NULL, size, PROT_READ|PROT_WRITE, MAP_PRIVATE, fd, 0);
assert(a != MAP_FAILED);
asm __volatile__(
"lxvd2x 40,0,%[vecinptr] ; " // set 40 to initial value
TBEGIN
"beq 3f ;"
TSUSPEND
"xxlxor 40,40,40 ; " // set 40 to 0
"std 5, 0(%[map]) ;" // cause kernel vmx copy page
TABORT
TRESUME
TEND
"li %[res], 0 ;"
"b 5f ;"
"3: ;" // Abort handler
"li %[res], 1 ;"
"5: ;"
"stxvd2x 40,0,%[vecoutptr] ; "
: [res]"=r"(aborted)
: [vecinptr]"r"(&vecin),
[vecoutptr]"r"(&vecout),
[map]"r"(a)
: "memory", "r0", "r3", "r4", "r5", "r6", "r7");
if (aborted && (vecin != vecout)){
printf("FAILED: vector state leaked on abort %f != %f\n",
(double)vecin, (double)vecout);
exit(1);
}
munmap(a, size);
close(fd);
printf("PASSED!\n");
return 0;
}
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-01-13 15:56:29 +11:00
tm_reclaim_current ( TM_CAUSE_FAC_UNAV ) ;
2018-06-18 19:59:42 -03:00
/*
* Reclaim initially saved out bogus ( lazy ) FPRs to ckfp_state , and
* these were then overwritten with thr - > fp_state by tm_reclaim_thread ( ) .
*
* At this point , ck { fp , vr } _state contains the exact values we want to
* recheckpoint .
*/
2013-02-13 16:21:39 +00:00
/* Enable FP for the task: */
2017-11-02 14:09:03 +11:00
current - > thread . load_fp = 1 ;
2013-02-13 16:21:39 +00:00
2018-06-18 19:59:42 -03:00
/*
* Recheckpoint all the checkpointed ckpt , ck { fp , vr } _state registers .
2013-02-13 16:21:39 +00:00
*/
2017-11-02 14:09:05 +11:00
tm_recheckpoint ( & current - > thread ) ;
2013-02-13 16:21:39 +00:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( altivec_unavailable_tm )
2013-02-13 16:21:39 +00:00
{
/* See the comments in fp_unavailable_tm(). This function operates
* the same way .
*/
TM_DEBUG ( " Vector Unavailable trap whilst transactional at 0x%lx, "
" MSR=%lx \n " ,
regs - > nip , regs - > msr ) ;
2014-01-13 15:56:29 +11:00
tm_reclaim_current ( TM_CAUSE_FAC_UNAV ) ;
2017-11-02 14:09:03 +11:00
current - > thread . load_vec = 1 ;
2017-11-02 14:09:05 +11:00
tm_recheckpoint ( & current - > thread ) ;
2013-02-13 16:21:39 +00:00
current - > thread . used_vr = 1 ;
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( vsx_unavailable_tm )
2013-02-13 16:21:39 +00:00
{
/* See the comments in fp_unavailable_tm(). This works similarly,
* though we ' re loading both FP and VEC registers in here .
*
* If FP isn ' t in use , load FP regs . If VEC isn ' t in use , load VEC
* regs . Either way , set MSR_VSX .
*/
TM_DEBUG ( " VSX Unavailable trap whilst transactional at 0x%lx, "
" MSR=%lx \n " ,
regs - > nip , regs - > msr ) ;
2014-01-13 15:56:30 +11:00
current - > thread . used_vsr = 1 ;
2013-02-13 16:21:39 +00:00
/* This reclaims FP and/or VR regs if they're already enabled */
2014-01-13 15:56:29 +11:00
tm_reclaim_current ( TM_CAUSE_FAC_UNAV ) ;
2013-02-13 16:21:39 +00:00
2017-11-02 14:09:03 +11:00
current - > thread . load_vec = 1 ;
current - > thread . load_fp = 1 ;
2014-01-13 15:56:30 +11:00
2017-11-02 14:09:05 +11:00
tm_recheckpoint ( & current - > thread ) ;
2013-02-13 16:21:39 +00:00
}
# endif /* CONFIG_PPC_TRANSACTIONAL_MEM */
2021-01-30 23:08:38 +10:00
# ifdef CONFIG_PPC64
DECLARE_INTERRUPT_HANDLER_NMI ( performance_monitor_exception_nmi ) ;
DEFINE_INTERRUPT_HANDLER_NMI ( performance_monitor_exception_nmi )
2021-01-30 23:08:29 +10:00
{
__this_cpu_inc ( irq_stat . pmu_irqs ) ;
perf_irq ( regs ) ;
2021-01-30 23:08:38 +10:00
return 0 ;
2021-01-30 23:08:29 +10:00
}
2021-01-30 23:08:38 +10:00
# endif
2021-01-30 23:08:29 +10:00
2021-01-30 23:08:38 +10:00
DECLARE_INTERRUPT_HANDLER_ASYNC ( performance_monitor_exception_async ) ;
DEFINE_INTERRUPT_HANDLER_ASYNC ( performance_monitor_exception_async )
2005-10-01 18:43:42 +10:00
{
powerpc: Replace __get_cpu_var uses
This still has not been merged and now powerpc is the only arch that does
not have this change. Sorry about missing linuxppc-dev before.
V2->V2
- Fix up to work against 3.18-rc1
__get_cpu_var() is used for multiple purposes in the kernel source. One of
them is address calculation via the form &__get_cpu_var(x). This calculates
the address for the instance of the percpu variable of the current processor
based on an offset.
Other use cases are for storing and retrieving data from the current
processor's percpu area. __get_cpu_var() can be used as an lvalue when
writing data or on the right side of an assignment.
__get_cpu_var() is defined as :
__get_cpu_var() only ever performs an address determination. However, store
and retrieve operations could use a segment prefix (or global register on
other platforms) to avoid the address calculation.
this_cpu_write() and this_cpu_read() can directly take an offset into a
percpu area and use optimized assembly code to read and write per cpu
variables.
This patch converts __get_cpu_var into either an explicit address
calculation using this_cpu_ptr() or into a use of this_cpu operations that
use the offset. Thereby address calculations are avoided and fewer registers
are used when code is generated.
At the end of the patch set all uses of __get_cpu_var have been removed so
the macro is removed too.
The patch set includes passes over all arches as well. Once these operations
are used throughout, specialized macros can be defined in non-x86
arches as well in order to optimize per cpu access, e.g. by using a global
register that may be set to the per cpu base.
Transformations done to __get_cpu_var()
1. Determine the address of the percpu instance of the current processor.
DEFINE_PER_CPU(int, y);
int *x = &__get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(&y);
2. Same as #1 but this time an array structure is involved.
DEFINE_PER_CPU(int, y[20]);
int *x = __get_cpu_var(y);
Converts to
int *x = this_cpu_ptr(y);
3. Retrieve the content of the current processor's instance of a per cpu
variable.
DEFINE_PER_CPU(int, y);
int x = __get_cpu_var(y);
Converts to
int x = __this_cpu_read(y);
4. Retrieve the content of a percpu struct
DEFINE_PER_CPU(struct mystruct, y);
struct mystruct x = __get_cpu_var(y);
Converts to
memcpy(&x, this_cpu_ptr(&y), sizeof(x));
5. Assignment to a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y) = x;
Converts to
__this_cpu_write(y, x);
6. Increment/Decrement etc of a per cpu variable
DEFINE_PER_CPU(int, y);
__get_cpu_var(y)++;
Converts to
__this_cpu_inc(y);
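The transformations listed above can be tried out with a rough user-space
model. Everything prefixed model_/MODEL_ is hypothetical: the per cpu area
is a plain array indexed by a fake "current CPU" variable, whereas the
kernel uses per cpu offsets and arch-specific accessors. Note also that
model_this_cpu_ptr() takes the variable name, unlike the kernel's
this_cpu_ptr(&y).

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 4

/* Hypothetical: the "current CPU" is just a variable in this model. */
static int model_current_cpu;

/* One slot per CPU; the kernel uses per cpu offsets instead of an array. */
#define MODEL_DEFINE_PER_CPU(type, name)  type name[MODEL_NR_CPUS]
#define model_this_cpu_ptr(name)          (&(name)[model_current_cpu])
#define model_this_cpu_read(name)         ((name)[model_current_cpu])
#define model_this_cpu_write(name, val)   ((name)[model_current_cpu] = (val))
#define model_this_cpu_inc(name)          ((name)[model_current_cpu]++)

static MODEL_DEFINE_PER_CPU(int, model_pmu_irqs);

/* Pretend to migrate to another CPU. */
static void model_set_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_current_cpu = cpu;
}

/* Mirrors the shape of __this_cpu_inc(irq_stat.pmu_irqs) in the handler. */
static int model_count_pmu_irq(void)
{
	model_this_cpu_inc(model_pmu_irqs);
	return model_this_cpu_read(model_pmu_irqs);
}
```

Each CPU slot keeps its own count, so an increment performed on one CPU
is not visible when reading on another.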
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
CC: Paul Mackerras <paulus@samba.org>
Signed-off-by: Christoph Lameter <cl@linux.com>
[mpe: Fix build errors caused by set/or_softirq_pending(), and rework
assignment in __set_breakpoint() to use memcpy().]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2014-10-21 15:23:25 -05:00
__this_cpu_inc ( irq_stat . pmu_irqs ) ;
2010-01-31 20:34:06 +00:00
2005-10-01 18:43:42 +10:00
perf_irq ( regs ) ;
2021-01-30 23:08:29 +10:00
}
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER_RAW ( performance_monitor_exception )
2021-01-30 23:08:29 +10:00
{
/*
* On 64 - bit , if perf interrupts hit in a local_irq_disable
* ( soft - masked ) region , we consider them as NMIs . This is required to
* prevent hash faults on user addresses when reading callchains ( and
* looks better from an irq tracing perspective ) .
*/
if ( IS_ENABLED ( CONFIG_PPC64 ) & & unlikely ( arch_irq_disabled_regs ( regs ) ) )
performance_monitor_exception_nmi ( regs ) ;
else
performance_monitor_exception_async ( regs ) ;
2021-01-30 23:08:38 +10:00
return 0 ;
2005-10-01 18:43:42 +10:00
}
2010-02-08 11:50:57 +00:00
# ifdef CONFIG_PPC_ADV_DEBUG_REGS
2010-02-08 11:51:18 +00:00
static void handle_debug ( struct pt_regs * regs , unsigned long debug_status )
{
int changed = 0 ;
/*
* Determine the cause of the debug event , clear the
* event flags and send a trap to the handler . Torez
*/
if ( debug_status & ( DBSR_DAC1R | DBSR_DAC1W ) ) {
dbcr_dac ( current ) & = ~ ( DBCR_DAC1R | DBCR_DAC1W ) ;
# ifdef CONFIG_PPC_ADV_DEBUG_DAC_RANGE
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr2 & = ~ DBCR2_DAC12MODE ;
2010-02-08 11:51:18 +00:00
# endif
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_DAC1 ) , debug_status ,
2010-02-08 11:51:18 +00:00
5 ) ;
changed | = 0x01 ;
} else if ( debug_status & ( DBSR_DAC2R | DBSR_DAC2W ) ) {
dbcr_dac ( current ) & = ~ ( DBCR_DAC2R | DBCR_DAC2W ) ;
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_DAC2 ) , debug_status ,
2010-02-08 11:51:18 +00:00
6 ) ;
changed | = 0x01 ;
} else if ( debug_status & DBSR_IAC1 ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IAC1 ;
2010-02-08 11:51:18 +00:00
dbcr_iac_range ( current ) & = ~ DBCR_IAC12MODE ;
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_IAC1 ) , debug_status ,
2010-02-08 11:51:18 +00:00
1 ) ;
changed | = 0x01 ;
} else if ( debug_status & DBSR_IAC2 ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IAC2 ;
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_IAC2 ) , debug_status ,
2010-02-08 11:51:18 +00:00
2 ) ;
changed | = 0x01 ;
} else if ( debug_status & DBSR_IAC3 ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IAC3 ;
2010-02-08 11:51:18 +00:00
dbcr_iac_range ( current ) & = ~ DBCR_IAC34MODE ;
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_IAC3 ) , debug_status ,
2010-02-08 11:51:18 +00:00
3 ) ;
changed | = 0x01 ;
} else if ( debug_status & DBSR_IAC4 ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IAC4 ;
2018-01-16 16:12:38 -06:00
do_send_trap ( regs , mfspr ( SPRN_IAC4 ) , debug_status ,
2010-02-08 11:51:18 +00:00
4 ) ;
changed | = 0x01 ;
}
/*
* At the point this routine was called , the MSR ( DE ) was turned off .
* Check all other debug flags and see if that bit needs to be turned
* back on or not .
*/
2013-07-04 11:45:46 +05:30
if ( DBCR_ACTIVE_EVENTS ( current - > thread . debug . dbcr0 ,
2013-06-26 11:12:22 +05:30
current - > thread . debug . dbcr1 ) )
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr | MSR_DE ) ;
2010-02-08 11:51:18 +00:00
else
/* Make sure the IDM flag is off */
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IDM ;
2010-02-08 11:51:18 +00:00
if ( changed & 0x01 )
2013-07-04 11:45:46 +05:30
mtspr ( SPRN_DBCR0 , current - > thread . debug . dbcr0 ) ;
2010-02-08 11:51:18 +00:00
}
2005-09-26 16:04:21 +10:00
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( DebugException )
2005-09-26 16:04:21 +10:00
{
2021-01-30 23:08:19 +10:00
unsigned long debug_status = regs - > dsisr ;
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbsr = debug_status ;
2010-02-08 11:51:18 +00:00
2009-05-28 21:26:38 +00:00
/* Hack alert: On BookE, Branch Taken stops on the branch itself, while
* on server , it stops on the target of the branch . In order to simulate
* the server behaviour , we thus restart right away with a single step
* instead of stopping here when hitting a BT
*/
if ( debug_status & DBSR_BT ) {
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr & ~ MSR_DE ) ;
2009-05-28 21:26:38 +00:00
/* Disable BT */
mtspr ( SPRN_DBCR0 , mfspr ( SPRN_DBCR0 ) & ~ DBCR0_BT ) ;
/* Clear the BT event */
mtspr ( SPRN_DBSR , DBSR_BT ) ;
/* Do the single step trick only when coming from userspace */
if ( user_mode ( regs ) ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_BT ;
current - > thread . debug . dbcr0 | = DBCR0_IDM | DBCR0_IC ;
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr | MSR_DE ) ;
2009-05-28 21:26:38 +00:00
return ;
}
2016-11-21 22:36:41 +05:30
if ( kprobe_post_handler ( regs ) )
return ;
2009-05-28 21:26:38 +00:00
if ( notify_die ( DIE_SSTEP , " block_step " , regs , 5 ,
5 , SIGTRAP ) = = NOTIFY_STOP ) {
return ;
}
if ( debugger_sstep ( regs ) )
return ;
} else if ( debug_status & DBSR_IC ) { /* Instruction complete */
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr & ~ MSR_DE ) ;
2008-06-26 02:01:37 -05:00
/* Disable instruction completion */
mtspr ( SPRN_DBCR0 , mfspr ( SPRN_DBCR0 ) & ~ DBCR0_IC ) ;
/* Clear the instruction completion event */
mtspr ( SPRN_DBSR , DBSR_IC ) ;
2016-11-21 22:36:41 +05:30
if ( kprobe_post_handler ( regs ) )
return ;
2008-06-26 02:01:37 -05:00
if ( notify_die ( DIE_SSTEP , " single_step " , regs , 5 ,
5 , SIGTRAP ) = = NOTIFY_STOP ) {
return ;
}
if ( debugger_sstep ( regs ) )
return ;
2008-07-24 02:10:41 +10:00
if ( user_mode ( regs ) ) {
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IC ;
if ( DBCR_ACTIVE_EVENTS ( current - > thread . debug . dbcr0 ,
current - > thread . debug . dbcr1 ) )
2021-06-18 01:51:03 +10:00
regs_set_return_msr ( regs , regs - > msr | MSR_DE ) ;
2010-02-08 11:51:18 +00:00
else
/* Make sure the IDM bit is off */
2013-07-04 11:45:46 +05:30
current - > thread . debug . dbcr0 & = ~ DBCR0_IDM ;
2008-07-24 02:10:41 +10:00
}
2010-02-08 11:51:18 +00:00
_exception ( SIGTRAP , regs , TRAP_TRACE , regs - > nip ) ;
} else
handle_debug ( regs , debug_status ) ;
2005-09-26 16:04:21 +10:00
}
2010-02-08 11:50:57 +00:00
# endif /* CONFIG_PPC_ADV_DEBUG_REGS */
2005-09-26 16:04:21 +10:00
# ifdef CONFIG_ALTIVEC
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( altivec_assist_exception )
2005-09-26 16:04:21 +10:00
{
int err ;
if ( ! user_mode ( regs ) ) {
printk ( KERN_EMERG " VMX/Altivec assist exception in kernel mode "
" at %lx \n " , regs - > nip ) ;
2005-10-06 13:27:05 +10:00
die ( " Kernel VMX/Altivec assist exception " , regs , SIGILL ) ;
2005-09-26 16:04:21 +10:00
}
2005-10-01 18:43:42 +10:00
flush_altivec_to_thread ( current ) ;
2009-10-27 18:46:55 +00:00
PPC_WARN_EMULATED ( altivec , regs ) ;
2005-09-26 16:04:21 +10:00
err = emulate_altivec ( regs ) ;
if ( err = = 0 ) {
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ; /* skip emulated instruction */
2005-09-26 16:04:21 +10:00
emulate_single_step ( regs ) ;
return ;
}
if ( err = = - EFAULT ) {
/* got an error reading the instruction */
_exception ( SIGSEGV , regs , SEGV_ACCERR , regs - > nip ) ;
} else {
/* didn't recognize the instruction */
/* XXX quick hack for now: set the non-Java bit in the VSCR */
2011-06-04 05:36:54 +00:00
printk_ratelimited ( KERN_ERR " Unrecognized altivec instruction "
" in %s at %lx \n " , current - > comm , regs - > nip ) ;
2013-09-10 20:20:42 +10:00
current - > thread . vr_state . vscr . u [ 3 ] | = 0x10000 ;
2005-09-26 16:04:21 +10:00
}
}
# endif /* CONFIG_ALTIVEC */
# ifdef CONFIG_FSL_BOOKE
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( CacheLockingException )
2005-09-26 16:04:21 +10:00
{
2021-01-30 23:08:17 +10:00
unsigned long error_code = regs - > dsisr ;
2005-09-26 16:04:21 +10:00
/* We treat cache locking instructions from the user
* as priv ops , in the future we could try to do
* something smarter
*/
if ( error_code & ( ESR_DLK | ESR_ILK ) )
_exception ( SIGILL , regs , ILL_PRVOPC , regs - > nip ) ;
return ;
}
# endif /* CONFIG_FSL_BOOKE */
# ifdef CONFIG_SPE
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( SPEFloatingPointException )
2005-09-26 16:04:21 +10:00
{
2008-10-28 11:50:21 +08:00
extern int do_spe_mathemu ( struct pt_regs * regs ) ;
2005-09-26 16:04:21 +10:00
unsigned long spefscr ;
int fpexc_mode ;
2018-04-17 15:30:54 -05:00
int code = FPE_FLTUNK ;
2008-10-28 11:50:21 +08:00
int err ;
2021-01-30 23:08:39 +10:00
interrupt_cond_local_irq_enable ( regs ) ;
2019-04-30 12:38:57 +00:00
2011-06-14 18:34:25 -05:00
flush_spe_to_thread ( current ) ;
2005-09-26 16:04:21 +10:00
spefscr = current - > thread . spefscr ;
fpexc_mode = current - > thread . fpexc_mode ;
if ( ( spefscr & SPEFSCR_FOVF ) & & ( fpexc_mode & PR_FP_EXC_OVF ) ) {
code = FPE_FLTOVF ;
}
else if ( ( spefscr & SPEFSCR_FUNF ) & & ( fpexc_mode & PR_FP_EXC_UND ) ) {
code = FPE_FLTUND ;
}
else if ( ( spefscr & SPEFSCR_FDBZ ) & & ( fpexc_mode & PR_FP_EXC_DIV ) )
code = FPE_FLTDIV ;
else if ( ( spefscr & SPEFSCR_FINV ) & & ( fpexc_mode & PR_FP_EXC_INV ) ) {
code = FPE_FLTINV ;
}
else if ( ( spefscr & ( SPEFSCR_FG | SPEFSCR_FX ) ) & & ( fpexc_mode & PR_FP_EXC_RES ) )
code = FPE_FLTRES ;
2008-10-28 11:50:21 +08:00
err = do_spe_mathemu ( regs ) ;
if ( err = = 0 ) {
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ; /* skip emulated instruction */
2008-10-28 11:50:21 +08:00
emulate_single_step ( regs ) ;
return ;
}
if ( err = = - EFAULT ) {
/* got an error reading the instruction */
_exception ( SIGSEGV , regs , SEGV_ACCERR , regs - > nip ) ;
} else if ( err = = - EINVAL ) {
/* didn't recognize the instruction */
printk ( KERN_ERR " unrecognized spe instruction "
" in %s at %lx \n " , current - > comm , regs - > nip ) ;
} else {
_exception ( SIGFPE , regs , code , regs - > nip ) ;
}
2005-09-26 16:04:21 +10:00
return ;
}
2008-10-28 11:50:21 +08:00
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( SPEFloatingPointRoundException )
2008-10-28 11:50:21 +08:00
{
extern int speround_handler ( struct pt_regs * regs ) ;
int err ;
2021-01-30 23:08:39 +10:00
interrupt_cond_local_irq_enable ( regs ) ;
2019-04-30 12:38:57 +00:00
2008-10-28 11:50:21 +08:00
preempt_disable ( ) ;
if ( regs - > msr & MSR_SPE )
giveup_spe ( current ) ;
preempt_enable ( ) ;
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , - 4 ) ;
2008-10-28 11:50:21 +08:00
err = speround_handler ( regs ) ;
if ( err = = 0 ) {
2021-06-18 01:51:03 +10:00
regs_add_return_ip ( regs , 4 ) ; /* skip emulated instruction */
2008-10-28 11:50:21 +08:00
emulate_single_step ( regs ) ;
return ;
}
if ( err = = - EFAULT ) {
/* got an error reading the instruction */
_exception ( SIGSEGV , regs , SEGV_ACCERR , regs - > nip ) ;
} else if ( err = = - EINVAL ) {
/* didn't recognize the instruction */
printk ( KERN_ERR " unrecognized spe instruction "
" in %s at %lx \n " , current - > comm , regs - > nip ) ;
} else {
2018-04-17 15:30:54 -05:00
_exception ( SIGFPE , regs , FPE_FLTUNK , regs - > nip ) ;
2008-10-28 11:50:21 +08:00
return ;
}
}
2005-09-26 16:04:21 +10:00
# endif
2005-10-01 18:43:42 +10:00
/*
* We enter here if we get an unrecoverable exception , that is , one
* that happened at a point where the RI ( recoverable interrupt ) bit
* in the MSR is 0. This indicates that SRR0 / 1 are live , and that
* we therefore lost state by taking this exception .
*/
powerpc/traps: Declare unrecoverable_exception() as __noreturn
unrecoverable_exception() is never expected to return; most callers
have an infinite loop in case it does.
Ensure it really never returns by terminating it with a BUG(), and
declare it __noreturn.
This allows GCC to simplify the functions that call it. In the example
below, it avoids the stack frame in the likely fast path and avoids
code duplication for the exit.
With this patch:
00000348 <interrupt_exit_kernel_prepare>:
348: 81 43 00 84 lwz r10,132(r3)
34c: 71 48 00 02 andi. r8,r10,2
350: 41 82 00 2c beq 37c <interrupt_exit_kernel_prepare+0x34>
354: 71 4a 40 00 andi. r10,r10,16384
358: 40 82 00 20 bne 378 <interrupt_exit_kernel_prepare+0x30>
35c: 80 62 00 70 lwz r3,112(r2)
360: 74 63 00 01 andis. r3,r3,1
364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44>
368: 7d 40 00 a6 mfmsr r10
36c: 7c 11 13 a6 mtspr 81,r0
370: 7c 12 13 a6 mtspr 82,r0
374: 4e 80 00 20 blr
378: 48 00 00 00 b 378 <interrupt_exit_kernel_prepare+0x30>
37c: 94 21 ff f0 stwu r1,-16(r1)
380: 7c 08 02 a6 mflr r0
384: 90 01 00 14 stw r0,20(r1)
388: 48 00 00 01 bl 388 <interrupt_exit_kernel_prepare+0x40>
388: R_PPC_REL24 unrecoverable_exception
38c: 38 e2 00 70 addi r7,r2,112
390: 3d 00 00 01 lis r8,1
394: 7c c0 38 28 lwarx r6,0,r7
398: 7c c6 40 78 andc r6,r6,r8
39c: 7c c0 39 2d stwcx. r6,0,r7
3a0: 40 a2 ff f4 bne 394 <interrupt_exit_kernel_prepare+0x4c>
3a4: 38 60 00 01 li r3,1
3a8: 4b ff ff c0 b 368 <interrupt_exit_kernel_prepare+0x20>
Without this patch:
00000348 <interrupt_exit_kernel_prepare>:
348: 94 21 ff f0 stwu r1,-16(r1)
34c: 93 e1 00 0c stw r31,12(r1)
350: 7c 7f 1b 78 mr r31,r3
354: 81 23 00 84 lwz r9,132(r3)
358: 71 2a 00 02 andi. r10,r9,2
35c: 41 82 00 34 beq 390 <interrupt_exit_kernel_prepare+0x48>
360: 71 29 40 00 andi. r9,r9,16384
364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44>
368: 80 62 00 70 lwz r3,112(r2)
36c: 74 63 00 01 andis. r3,r3,1
370: 40 82 00 3c bne 3ac <interrupt_exit_kernel_prepare+0x64>
374: 7d 20 00 a6 mfmsr r9
378: 7c 11 13 a6 mtspr 81,r0
37c: 7c 12 13 a6 mtspr 82,r0
380: 83 e1 00 0c lwz r31,12(r1)
384: 38 21 00 10 addi r1,r1,16
388: 4e 80 00 20 blr
38c: 48 00 00 00 b 38c <interrupt_exit_kernel_prepare+0x44>
390: 7c 08 02 a6 mflr r0
394: 90 01 00 14 stw r0,20(r1)
398: 48 00 00 01 bl 398 <interrupt_exit_kernel_prepare+0x50>
398: R_PPC_REL24 unrecoverable_exception
39c: 80 01 00 14 lwz r0,20(r1)
3a0: 81 3f 00 84 lwz r9,132(r31)
3a4: 7c 08 03 a6 mtlr r0
3a8: 4b ff ff b8 b 360 <interrupt_exit_kernel_prepare+0x18>
3ac: 39 02 00 70 addi r8,r2,112
3b0: 3d 40 00 01 lis r10,1
3b4: 7c e0 40 28 lwarx r7,0,r8
3b8: 7c e7 50 78 andc r7,r7,r10
3bc: 7c e0 41 2d stwcx. r7,0,r8
3c0: 40 a2 ff f4 bne 3b4 <interrupt_exit_kernel_prepare+0x6c>
3c4: 38 60 00 01 li r3,1
3c8: 4b ff ff ac b 374 <interrupt_exit_kernel_prepare+0x2c>
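The pattern behind the two listings above can be sketched in user space.
This is a minimal illustration assuming GCC or Clang; model_unrecoverable(),
model_exit_prepare() and MODEL_MSR_RI are hypothetical names, and abort()
merely plays the role of the trailing BUG().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MODEL_MSR_RI 0x2UL  /* hypothetical stand-in for the RI bit */

/* Never returns: the attribute tells the compiler so, abort() enforces it. */
__attribute__((noreturn))
static void model_unrecoverable(unsigned long trap)
{
	fprintf(stderr, "Unrecoverable exception %lx\n", trap);
	abort();
}

/*
 * Caller in the style of interrupt_exit_kernel_prepare(): because the
 * callee is noreturn, the compiler need not keep any state live across
 * the call or emit a fallback loop after it.
 */
static int model_exit_prepare(unsigned long msr, unsigned long trap)
{
	if (!(msr & MODEL_MSR_RI))
		model_unrecoverable(trap);  /* control never comes back */
	return 0;
}
```

Compiling such a caller with and without the attribute and comparing the
generated code should show the same kind of difference as the listings,
though the exact instructions depend on compiler and target.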
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1e883e9d93fdb256853d1434c8ad77c257349b2d.1615552866.git.christophe.leroy@csgroup.eu
2021-03-12 12:50:10 +00:00
void __noreturn unrecoverable_exception ( struct pt_regs * regs )
2005-10-01 18:43:42 +10:00
{
2018-09-25 14:10:04 +00:00
pr_emerg ( " Unrecoverable exception %lx at %lx (msr=%lx) \n " ,
regs->trap , regs->nip , regs->msr ) ;
2005-10-01 18:43:42 +10:00
die ( " Unrecoverable exception " , regs , SIGABRT ) ;
powerpc/traps: Declare unrecoverable_exception() as __noreturn
unrecoverable_exception() is never expected to return; most callers
have an infinite loop in case it returns.
Ensure it really never returns by terminating it with a BUG(), and
declare it __noreturn.
It allows GCC to really simplify functions calling it. In the example
below, it avoids the stack frame in the likely fast path and avoids
code duplication for the exit.
With this patch:
00000348 <interrupt_exit_kernel_prepare>:
348: 81 43 00 84 lwz r10,132(r3)
34c: 71 48 00 02 andi. r8,r10,2
350: 41 82 00 2c beq 37c <interrupt_exit_kernel_prepare+0x34>
354: 71 4a 40 00 andi. r10,r10,16384
358: 40 82 00 20 bne 378 <interrupt_exit_kernel_prepare+0x30>
35c: 80 62 00 70 lwz r3,112(r2)
360: 74 63 00 01 andis. r3,r3,1
364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44>
368: 7d 40 00 a6 mfmsr r10
36c: 7c 11 13 a6 mtspr 81,r0
370: 7c 12 13 a6 mtspr 82,r0
374: 4e 80 00 20 blr
378: 48 00 00 00 b 378 <interrupt_exit_kernel_prepare+0x30>
37c: 94 21 ff f0 stwu r1,-16(r1)
380: 7c 08 02 a6 mflr r0
384: 90 01 00 14 stw r0,20(r1)
388: 48 00 00 01 bl 388 <interrupt_exit_kernel_prepare+0x40>
388: R_PPC_REL24 unrecoverable_exception
38c: 38 e2 00 70 addi r7,r2,112
390: 3d 00 00 01 lis r8,1
394: 7c c0 38 28 lwarx r6,0,r7
398: 7c c6 40 78 andc r6,r6,r8
39c: 7c c0 39 2d stwcx. r6,0,r7
3a0: 40 a2 ff f4 bne 394 <interrupt_exit_kernel_prepare+0x4c>
3a4: 38 60 00 01 li r3,1
3a8: 4b ff ff c0 b 368 <interrupt_exit_kernel_prepare+0x20>
Without this patch:
00000348 <interrupt_exit_kernel_prepare>:
348: 94 21 ff f0 stwu r1,-16(r1)
34c: 93 e1 00 0c stw r31,12(r1)
350: 7c 7f 1b 78 mr r31,r3
354: 81 23 00 84 lwz r9,132(r3)
358: 71 2a 00 02 andi. r10,r9,2
35c: 41 82 00 34 beq 390 <interrupt_exit_kernel_prepare+0x48>
360: 71 29 40 00 andi. r9,r9,16384
364: 40 82 00 28 bne 38c <interrupt_exit_kernel_prepare+0x44>
368: 80 62 00 70 lwz r3,112(r2)
36c: 74 63 00 01 andis. r3,r3,1
370: 40 82 00 3c bne 3ac <interrupt_exit_kernel_prepare+0x64>
374: 7d 20 00 a6 mfmsr r9
378: 7c 11 13 a6 mtspr 81,r0
37c: 7c 12 13 a6 mtspr 82,r0
380: 83 e1 00 0c lwz r31,12(r1)
384: 38 21 00 10 addi r1,r1,16
388: 4e 80 00 20 blr
38c: 48 00 00 00 b 38c <interrupt_exit_kernel_prepare+0x44>
390: 7c 08 02 a6 mflr r0
394: 90 01 00 14 stw r0,20(r1)
398: 48 00 00 01 bl 398 <interrupt_exit_kernel_prepare+0x50>
398: R_PPC_REL24 unrecoverable_exception
39c: 80 01 00 14 lwz r0,20(r1)
3a0: 81 3f 00 84 lwz r9,132(r31)
3a4: 7c 08 03 a6 mtlr r0
3a8: 4b ff ff b8 b 360 <interrupt_exit_kernel_prepare+0x18>
3ac: 39 02 00 70 addi r8,r2,112
3b0: 3d 40 00 01 lis r10,1
3b4: 7c e0 40 28 lwarx r7,0,r8
3b8: 7c e7 50 78 andc r7,r7,r10
3bc: 7c e0 41 2d stwcx. r7,0,r8
3c0: 40 a2 ff f4 bne 3b4 <interrupt_exit_kernel_prepare+0x6c>
3c4: 38 60 00 01 li r3,1
3c8: 4b ff ff ac b 374 <interrupt_exit_kernel_prepare+0x2c>
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1e883e9d93fdb256853d1434c8ad77c257349b2d.1615552866.git.christophe.leroy@csgroup.eu
2021-03-12 12:50:10 +00:00
/* die() should not return */
for ( ; ; )
;
2005-10-01 18:43:42 +10:00
}
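The effect of the __noreturn declaration on callers can be tried outside the kernel. The following is a minimal user-space sketch, not kernel code: demo_noreturn, demo_fatal() and demo_check() are hypothetical names, and abort() stands in for die(). It assumes a GCC-compatible compiler that accepts __attribute__((noreturn)).

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's __noreturn annotation. */
#define demo_noreturn __attribute__((noreturn))

/*
 * Hypothetical fatal handler. Like unrecoverable_exception(), it is
 * declared as never returning and ends in a call that cannot return.
 */
static demo_noreturn void demo_fatal(const char *what, unsigned long value)
{
	fprintf(stderr, "fatal: %s (%lx)\n", what, value);
	abort();
}

/*
 * Hypothetical caller. Bit 1 of flags means "recoverable" in this sketch.
 * Because demo_fatal() is noreturn, the compiler does not need to keep
 * flags alive across the call or emit a path that rejoins the code below.
 */
static unsigned long demo_check(unsigned long flags)
{
	if (!(flags & 2))
		demo_fatal("unrecoverable state", flags);

	/* Only reached when bit 1 is set. */
	return flags & ~2UL;
}
```

Compiling demo_check() with and without the attribute at -O2 and comparing the output of objdump -d is one way to observe the kind of difference shown in the commit message; the exact instructions depend on the compiler and target.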
2012-10-05 08:07:15 +00:00
# if defined(CONFIG_BOOKE_WDT) || defined(CONFIG_40x)
2005-09-26 16:04:21 +10:00
/*
* Default handler for a Watchdog exception ,
* disables the watchdog interrupt ( clears TCR_WIE ) and returns
*/
void __attribute__ ( ( weak ) ) WatchdogHandler ( struct pt_regs * regs )
{
/* Generic WatchdogHandler, implement your own */
mtspr ( SPRN_TCR , mfspr ( SPRN_TCR ) & ( ~ TCR_WIE ) ) ;
return ;
}
2021-03-16 20:41:59 +10:00
DEFINE_INTERRUPT_HANDLER_NMI ( WatchdogException )
2005-09-26 16:04:21 +10:00
{
printk ( KERN_EMERG " PowerPC Book-E Watchdog Exception \n " ) ;
WatchdogHandler ( regs ) ;
2021-03-16 20:41:59 +10:00
return 0 ;
2005-09-26 16:04:21 +10:00
}
# endif
2005-10-01 18:43:42 +10:00
/*
* We enter here if we discover during exception entry that we are
* running in supervisor mode with a userspace value in the stack pointer .
*/
2021-01-30 23:08:38 +10:00
DEFINE_INTERRUPT_HANDLER ( kernel_bad_stack )
2005-10-01 18:43:42 +10:00
{
printk ( KERN_EMERG " Bad kernel stack pointer %lx at %lx \n " ,
regs->gpr [ 1 ] , regs->nip ) ;
die ( " Bad kernel stack pointer " , regs , SIGABRT ) ;
}
2005-09-26 16:04:21 +10:00
void __init trap_init ( void )
{
}
2009-05-18 02:10:05 +00:00
# ifdef CONFIG_PPC_EMULATED_STATS
# define WARN_EMULATED_SETUP(type) .type = { .name = #type }
struct ppc_emulated ppc_emulated = {
# ifdef CONFIG_ALTIVEC
WARN_EMULATED_SETUP ( altivec ) ,
# endif
WARN_EMULATED_SETUP ( dcba ) ,
WARN_EMULATED_SETUP ( dcbz ) ,
WARN_EMULATED_SETUP ( fp_pair ) ,
WARN_EMULATED_SETUP ( isel ) ,
WARN_EMULATED_SETUP ( mcrxr ) ,
WARN_EMULATED_SETUP ( mfpvr ) ,
WARN_EMULATED_SETUP ( multiple ) ,
WARN_EMULATED_SETUP ( popcntb ) ,
WARN_EMULATED_SETUP ( spe ) ,
WARN_EMULATED_SETUP ( string ) ,
2013-10-28 22:07:59 -05:00
WARN_EMULATED_SETUP ( sync ) ,
2009-05-18 02:10:05 +00:00
WARN_EMULATED_SETUP ( unaligned ) ,
# ifdef CONFIG_MATH_EMULATION
WARN_EMULATED_SETUP ( math ) ,
# endif
# ifdef CONFIG_VSX
WARN_EMULATED_SETUP ( vsx ) ,
# endif
2011-03-02 15:18:48 +00:00
# ifdef CONFIG_PPC64
WARN_EMULATED_SETUP ( mfdscr ) ,
WARN_EMULATED_SETUP ( mtdscr ) ,
2014-03-28 17:01:23 +11:00
WARN_EMULATED_SETUP ( lq_stq ) ,
2017-09-15 15:25:48 +10:00
WARN_EMULATED_SETUP ( lxvw4x ) ,
WARN_EMULATED_SETUP ( lxvh8x ) ,
WARN_EMULATED_SETUP ( lxvd2x ) ,
WARN_EMULATED_SETUP ( lxvb16x ) ,
2011-03-02 15:18:48 +00:00
# endif
2009-05-18 02:10:05 +00:00
} ;
u32 ppc_warn_emulated ;
void ppc_warn_emulated_print ( const char * type )
{
2011-06-04 05:36:54 +00:00
pr_warn_ratelimited ( " %s used emulated %s instruction \n " , current - > comm ,
type ) ;
2009-05-18 02:10:05 +00:00
}
static int __init ppc_warn_emulated_init ( void )
{
2020-02-09 11:58:56 +01:00
struct dentry * dir ;
2009-05-18 02:10:05 +00:00
unsigned int i ;
struct ppc_emulated_entry * entries = ( void * ) & ppc_emulated ;
dir = debugfs_create_dir ( " emulated_instructions " ,
2021-08-12 18:58:31 +05:30
arch_debugfs_dir ) ;
2009-05-18 02:10:05 +00:00
2020-02-09 11:58:56 +01:00
debugfs_create_u32 ( " do_warn " , 0644 , dir , & ppc_warn_emulated ) ;
2009-05-18 02:10:05 +00:00
2020-02-09 11:58:56 +01:00
for ( i = 0 ; i < sizeof ( ppc_emulated ) / sizeof ( * entries ) ; i++ )
debugfs_create_u32 ( entries [ i ] . name , 0644 , dir ,
( u32 * ) & entries [ i ] . val . counter ) ;
2009-05-18 02:10:05 +00:00
return 0 ;
}
device_initcall ( ppc_warn_emulated_init ) ;
# endif /* CONFIG_PPC_EMULATED_STATS */
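The pattern used by WARN_EMULATED_SETUP and ppc_warn_emulated_init() (stringify the member name into each entry, then walk the whole struct as an array of entries) can be sketched in plain C. This is an illustration only: struct demo_entry, DEMO_SETUP, demo_emulated and demo_entry_at() are hypothetical names, the counter is a plain unsigned int rather than the kernel's atomic type, and the sketch assumes the compiler inserts no padding between members of identical type, which the _Static_assert checks.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of an emulation-statistics entry. */
struct demo_entry {
	const char *name;
	unsigned int val;
};

/* #type turns the member name into a string, so each entry names itself. */
#define DEMO_SETUP(type) .type = { .name = #type }

/* Every member has the same type, so the struct doubles as an array. */
static struct demo_emulated {
	struct demo_entry dcba;
	struct demo_entry dcbz;
	struct demo_entry isel;
} demo_emulated = {
	DEMO_SETUP(dcba),
	DEMO_SETUP(dcbz),
	DEMO_SETUP(isel),
};

#define DEMO_COUNT (sizeof(demo_emulated) / sizeof(struct demo_entry))

/* Assumption made explicit: no padding between the members. */
_Static_assert(sizeof(struct demo_emulated) == 3 * sizeof(struct demo_entry),
	       "struct demo_emulated must be layout-compatible with an array");

/* Return the entry at index i, or NULL when i is out of range. */
static struct demo_entry *demo_entry_at(size_t i)
{
	struct demo_entry *entries = (void *)&demo_emulated;

	return i < DEMO_COUNT ? &entries[i] : NULL;
}
```

A caller can loop from 0 to DEMO_COUNT and read name and val from each entry, which mirrors how the init function above registers one debugfs file per entry without listing the members a second time.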