License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.
By default all files without license information are under the default
license of the kernel, which is GPL version 2.
Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.
This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.
How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information in it,
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information.
Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.
The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.
The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.
The criteria used to select files for SPDX license identifier tagging were:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if <5
lines).
All documentation files were explicitly excluded.
The following heuristics were used to determine which SPDX license
identifiers to apply.
- when both scanners couldn't find any license traces, file was
considered to have no license information in it, and the top level
COPYING file license applied.
For non */uapi/* files that summary was:
  SPDX license identifier                               # files
  ----------------------------------------------------|-------
  GPL-2.0                                                 11139
and resulted in the first patch in this series.
If that file was a */uapi/* path one, it was "GPL-2.0 WITH
Linux-syscall-note", otherwise it was "GPL-2.0". The results were:
  SPDX license identifier                               # files
  ----------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                           930
and resulted in the second patch in this series.
- if a file had some form of licensing information in it, and was one
of the */uapi/* ones, it was denoted with the Linux-syscall-note if
any GPL family license was found in the file or had no licensing in
it (per prior point). Results summary:
  SPDX license identifier                               # files
  ----------------------------------------------------|-------
  GPL-2.0 WITH Linux-syscall-note                           270
  GPL-2.0+ WITH Linux-syscall-note                          169
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)        21
  ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)        17
  LGPL-2.1+ WITH Linux-syscall-note                          15
  GPL-1.0+ WITH Linux-syscall-note                           14
  ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)        5
  LGPL-2.0+ WITH Linux-syscall-note                           4
  LGPL-2.1 WITH Linux-syscall-note                            3
  ((GPL-2.0 WITH Linux-syscall-note) OR MIT)                  3
  ((GPL-2.0 WITH Linux-syscall-note) AND MIT)                 1
and that resulted in the third patch in this series.
- when the two scanners agreed on the detected license(s), that became
the concluded license(s).
- when there was disagreement between the two scanners (one detected a
license but the other didn't, or they both detected different
licenses) a manual inspection of the file occurred.
- In most cases a manual inspection of the information in the file
resulted in a clear resolution of the license that should apply (and
which scanner probably needed to revisit its heuristics).
- When it was not immediately clear, the license identifier was
confirmed with lawyers working with the Linux Foundation.
- If there was any question as to the appropriate license identifier,
the file was flagged for further research and to be revisited later
in time.
In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.
Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there were any new insights. The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.
Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.
In the initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.
Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
- a full scancode scan run, collecting the matched texts, detected
license ids and scores
- reviewing anything where there was a license detected (about 500+
files) to ensure that the applied SPDX license was correct
- reviewing anything where there was no detection but the patch license
was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
SPDX license was correct
This produced a worksheet with 20 files needing minor correction. This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.
These .csv files were then reviewed by Greg. Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected. This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.) Finally Greg ran the script using the .csv files to
generate the patches.
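The header/source distinction mentioned above can be sketched as follows. This is only an illustration, not the script Thomas and Greg used: the helper name `spdx_line` and the extension-based dispatch are assumptions, and the comment styles follow the kernel's documented convention (`//` for C sources, `/* */` for headers and assembly, `#` for Makefiles and scripts).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the SPDX tag line for a file, choosing
 * the comment style from the file extension.
 */
static int spdx_line(const char *path, const char *id, char *out, size_t len)
{
	const char *ext = strrchr(path, '.');

	assert(out != NULL && len > 0);

	/* C source files take a C++ style comment. */
	if (ext && strcmp(ext, ".c") == 0)
		return snprintf(out, len, "// SPDX-License-Identifier: %s", id);

	/* Headers and assembly files take a C style comment. */
	if (ext && (strcmp(ext, ".h") == 0 || strcmp(ext, ".S") == 0))
		return snprintf(out, len, "/* SPDX-License-Identifier: %s */", id);

	/* Makefiles, Kconfig files and scripts take a '#' comment. */
	return snprintf(out, len, "# SPDX-License-Identifier: %s", id);
}
```

For an assembly file such as the one annotated here, this yields the `/* SPDX-License-Identifier: GPL-2.0 */` form seen on its first line.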
Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-01 17:07:57 +03:00
/* SPDX-License-Identifier: GPL-2.0 */
2009-06-03 01:17:38 +04:00
/*
 * This file contains the 64-bit "server" PowerPC variant
 * of the low level exception handling including exception
 * vectors, exception return, part of the slb and stab
 * handling and other fixed offset specific things.
 *
 * This file is meant to be #included from head_64.S due to
2011-03-31 05:57:33 +04:00
 * position dependent assembly.
2009-06-03 01:17:38 +04:00
 *
 * Most of this originates from head_64.S and thus has the same
 * copyright history.
 *
 */
powerpc: Rework lazy-interrupt handling
The current implementation of lazy interrupts handling has some
issues that this tries to address.
We don't do the various workarounds we need to do when re-enabling
interrupts in some cases such as when returning from an interrupt
and thus we may still lose or get delayed decrementer or doorbell
interrupts.
The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.
Additionally, we tend to keep interrupts hard disabled in a number
of cases, such as decrementer interrupts, external interrupts, or
when a masked decrementer interrupt is pending. This is sub-optimal.
This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling from the ground up.
The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.
When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field.
We then implement replaying of the missed interrupts either by
re-using the existing exception frame (in exception exit case) or via
the creation of a new one from an assembly trampoline (in the
arch_local_irq_enable case).
This removes the need to play with the decrementer to try to create
fake interrupts, among others.
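The record-and-replay idea can be modelled in a few lines of C. This is a simplified sketch under stated assumptions, not the kernel implementation: the bit names, `struct cpu_state`, and the helper functions are all hypothetical, and the real code replays interrupts through exception frames rather than by calling a handler directly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define HAPPENED_EE	0x01	/* external interrupt */
#define HAPPENED_DEC	0x02	/* decrementer */
#define HAPPENED_DBELL	0x04	/* doorbell */

struct cpu_state {
	int soft_enabled;	/* soft (lazy) interrupt enable flag */
	uint8_t irq_happened;	/* interrupts seen while soft-disabled */
	unsigned int handled;	/* number of handler invocations */
};

/* Stand-in for running the real interrupt handler. */
static void run_handler(struct cpu_state *c, uint8_t bit)
{
	(void)bit;
	c->handled++;
}

/* An interrupt arrives: run it, or just record it if soft-disabled. */
static void take_interrupt(struct cpu_state *c, uint8_t bit)
{
	if (!c->soft_enabled) {
		c->irq_happened |= bit;
		return;
	}
	run_handler(c, bit);
}

/* Restore the soft-enable state and replay anything that was missed. */
static void soft_irq_restore(struct cpu_state *c, int enable)
{
	c->soft_enabled = enable;
	if (!enable)
		return;

	while (c->irq_happened) {
		/* Isolate the lowest pending bit, clear it, then replay it. */
		uint8_t bit = (uint8_t)(c->irq_happened & -c->irq_happened);

		c->irq_happened &= (uint8_t)~bit;
		run_handler(c, bit);
	}
}
```

Because the field is a bit mask, two decrementer interrupts taken while soft-disabled collapse into a single replay in this model.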
In addition, this adds a few refinements:
- We no longer hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.
- Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.
- On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appears to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter). (This also fixes a bug where BookE
perfmon interrupts could clobber r14 ... oops)
- We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.
Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2:
- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
to retrigger an interrupt without preventing hard-enable
v3:
- Fix or vs. ori bug on Book3E
- Fix enabling of interrupts for some exceptions on Book3E
v4:
- Fix resend of doorbells on return from interrupt on Book3E
v5:
- Rebased on top of my latest series, which involves some significant
rework of some aspects of the patch.
v6:
- 32-bit compile fix
- more compile fixes with various .config combos
- factor out the asm code to soft-disable interrupts
- remove the C wrapper around preempt_schedule_irq
v7:
- Fix a bug with hard irq state tracking on native power7
2012-03-06 11:27:59 +04:00
#include <asm/hw_irq.h>
2009-07-15 00:52:52 +04:00
#include <asm/exception-64s.h>
2010-11-18 18:06:17 +03:00
#include <asm/ptrace.h>
2014-12-09 21:56:52 +03:00
#include <asm/cpuidle.h>
2016-09-30 12:43:18 +03:00
#include <asm/head-64.h>
2018-07-05 19:25:01 +03:00
#include <asm/feature-fixups.h>
2019-04-18 09:51:24 +03:00
#include <asm/kup.h>
2009-07-15 00:52:52 +04:00
2019-06-22 16:15:35 +03:00
/* PACA save area offsets (exgen, exmc, etc) */
#define EX_R9		0
#define EX_R10		8
#define EX_R11		16
#define EX_R12		24
#define EX_R13		32
#define EX_DAR		40
#define EX_DSISR	48
#define EX_CCR		52
#define EX_CFAR		56
#define EX_PPR		64
#define EX_CTR		72
.if EX_SIZE != 10
	.error "EX_SIZE is wrong"
.endif
2019-08-02 13:56:47 +03:00
/*
 * Following are fixed section helper macros.
 *
 * EXC_REAL_BEGIN/END  - real, unrelocated exception vectors
 * EXC_VIRT_BEGIN/END  - virt (AIL), unrelocated exception vectors
 * TRAMP_REAL_BEGIN    - real, unrelocated helpers (virt may call these)
 * TRAMP_VIRT_BEGIN    - virt, unreloc helpers (in practice, real can use)
 * EXC_COMMON          - After switching to virtual, relocated mode.
 */
2019-08-02 13:56:43 +03:00
#define EXC_REAL_BEGIN(name, start, size)			\
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##name, start, size)
#define EXC_REAL_END(name, start, size)				\
	FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##name, start, size)
#define EXC_VIRT_BEGIN(name, start, size)			\
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##name, start, size)
#define EXC_VIRT_END(name, start, size)				\
	FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##name, start, size)
#define EXC_COMMON_BEGIN(name)					\
	USE_TEXT_SECTION();					\
	.balign IFETCH_ALIGN_BYTES;				\
	.global name;						\
	_ASM_NOKPROBE_SYMBOL(name);				\
	DEFINE_FIXED_SYMBOL(name);				\
name:
#define TRAMP_REAL_BEGIN(name)					\
	FIXED_SECTION_ENTRY_BEGIN(real_trampolines, name)
#define TRAMP_VIRT_BEGIN(name)					\
	FIXED_SECTION_ENTRY_BEGIN(virt_trampolines, name)
#define EXC_REAL_NONE(start, size)				\
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(real_vectors, exc_real_##start##_##unused, start, size); \
	FIXED_SECTION_ENTRY_END_LOCATION(real_vectors, exc_real_##start##_##unused, start, size)
#define EXC_VIRT_NONE(start, size)				\
	FIXED_SECTION_ENTRY_BEGIN_LOCATION(virt_vectors, exc_virt_##start##_##unused, start, size); \
	FIXED_SECTION_ENTRY_END_LOCATION(virt_vectors, exc_virt_##start##_##unused, start, size)
2019-06-22 16:15:27 +03:00
/*
 * We're short on space and time in the exception prolog, so we can't
 * use the normal LOAD_REG_IMMEDIATE macro to load the address of label.
 * Instead we get the base of the kernel from paca->kernelbase and or in the low
 * part of label. This requires that the label be within 64KB of kernelbase, and
 * that kernelbase be 64K aligned.
 */
#define LOAD_HANDLER(reg, label)					\
	ld	reg,PACAKBASE(r13);	/* get high part of &label */	\
	ori	reg,reg,FIXED_SYMBOL_ABS_ADDR(label)
#define __LOAD_HANDLER(reg, label)					\
	ld	reg,PACAKBASE(r13);					\
	ori	reg,reg,(ABS_ADDR(label))@l
/*
 * Branches from unrelocated code (e.g., interrupts) to labels outside
 * head-y require >64K offsets.
 */
#define __LOAD_FAR_HANDLER(reg, label)					\
	ld	reg,PACAKBASE(r13);					\
	ori	reg,reg,(ABS_ADDR(label))@l;				\
	addis	reg,reg,(ABS_ADDR(label))@h
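The address arithmetic behind these macros can be checked with a small C model. It is a sketch only: it assumes ABS_ADDR(label) yields the label's offset from the kernel base, and the function names are hypothetical. The near form ORs in the low 16 bits, which is why the label must lie within 64KB of a 64K-aligned kernel base; the far form additionally adds the next 16 bits shifted left by 16, with the immediate sign-extended as `addis` does.

```c
#include <assert.h>
#include <stdint.h>

/* Model of LOAD_HANDLER: ld kbase; ori reg,reg,low16(label). */
static uint64_t load_handler(uint64_t kbase, uint64_t abs_addr)
{
	return kbase | (abs_addr & 0xffff);
}

/* Model of __LOAD_FAR_HANDLER: ld kbase; ori @l; addis @h. */
static uint64_t load_far_handler(uint64_t kbase, uint64_t abs_addr)
{
	uint64_t reg = kbase | (abs_addr & 0xffff);	/* ori  ...@l */
	uint64_t hi = (abs_addr >> 16) & 0xffff;	/* ...@h */

	/* addis sign-extends its 16-bit immediate before shifting. */
	if (hi & 0x8000)
		hi |= ~(uint64_t)0xffff;

	return reg + (hi << 16);			/* addis ...@h */
}
```

With a kernel base of 0xc000000000000000, an offset of 0x4c00 resolves correctly through either form, while an offset of 0x123456 is only reached by the far form.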
/*
 * Branch to label using its 0xC000 address. This results in instruction
 * address suitable for MSR[IR]=0 or 1, which allows relocation to be turned
 * on using mtmsr rather than rfid.
 *
 * This could set the 0xc bits for !RELOCATABLE as an immediate, rather than
 * load KBASE for a slight optimisation.
 */
#define BRANCH_TO_C000(reg, label)					\
powerpc/64s/exception: optimise system_reset for idle, clean up non-idle case
The idle wake up code in the system reset interrupt is not very
optimal. There are two requirements: perform idle wake up quickly;
and save everything including CFAR for non-idle interrupts, with
no performance requirement.
The problem with placing the idle test in the middle of the handler
and using the normal handler code to save CFAR, is that it's quite
costly (e.g., mfcfar is serialising, speculative workarounds get
applied, SRR1 has to be reloaded, etc). It also prevents the standard
interrupt handler boilerplate being used.
This pain can be avoided by using a dedicated idle interrupt handler
at the start of the interrupt handler, which restores all registers
back to the way they were in case it was not an idle wake up. CFAR
is preserved without saving it before the non-idle case by making that
the fall-through, and idle is a taken branch.
Performance seems to be in the noise, but possibly around 0.5% faster,
the executed instructions certainly look better. The bigger benefit is
being able to drop in standard interrupt handlers after the idle code,
which helps with subsequent cleanup and consolidation.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
[mpe: Fixup BE by using DOTSYM for idle_return_gpr_loss call]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-28 09:33:19 +03:00
	__LOAD_FAR_HANDLER(reg, label);					\
2019-06-22 16:15:27 +03:00
	mtctr	reg;							\
	bctr
2020-02-25 20:35:10 +03:00
/*
 * Interrupt code generation macros
 */
2020-02-25 20:35:28 +03:00
#define IVEC		.L_IVEC_\name\()	/* Interrupt vector address */
#define IHSRR		.L_IHSRR_\name\()	/* Sets SRR or HSRR registers */
#define IHSRR_IF_HVMODE	.L_IHSRR_IF_HVMODE_\name\() /* HSRR if HV else SRR */
#define IAREA		.L_IAREA_\name\()	/* PACA save area */
#define IVIRT		.L_IVIRT_\name\()	/* Has virt mode entry point */
#define IISIDE		.L_IISIDE_\name\()	/* Uses SRR0/1 not DAR/DSISR */
#define IDAR		.L_IDAR_\name\()	/* Uses DAR (or SRR0) */
#define IDSISR		.L_IDSISR_\name\()	/* Uses DSISR (or SRR1) */
#define ISET_RI		.L_ISET_RI_\name\()	/* Run common code w/ MSR[RI]=1 */
#define IBRANCH_TO_COMMON	.L_IBRANCH_TO_COMMON_\name\() /* ENTRY branch to common */
#define IREALMODE_COMMON	.L_IREALMODE_COMMON_\name\() /* Common runs in realmode */
#define IMASK		.L_IMASK_\name\()	/* IRQ soft-mask bit */
#define IKVM_SKIP	.L_IKVM_SKIP_\name\()	/* Generate KVM skip handler */
#define IKVM_REAL	.L_IKVM_REAL_\name\()	/* Real entry tests KVM */
2020-02-25 20:35:14 +03:00
#define __IKVM_REAL(name)	.L_IKVM_REAL_ ## name
2020-02-25 20:35:28 +03:00
#define IKVM_VIRT	.L_IKVM_VIRT_\name\()	/* Virt entry tests KVM */
#define ISTACK		.L_ISTACK_\name\()	/* Set regular kernel stack */
2020-02-25 20:35:14 +03:00
#define __ISTACK(name)	.L_ISTACK_ ## name
2020-02-25 20:35:28 +03:00
#define IRECONCILE	.L_IRECONCILE_\name\()	/* Do RECONCILE_IRQ_STATE */
#define IKUAP		.L_IKUAP_\name\()	/* Do KUAP lock */
2020-02-25 20:35:10 +03:00
#define INT_DEFINE_BEGIN(n)						\
.macro int_define_ ## n name
#define INT_DEFINE_END(n)						\
.endm ;									\
int_define_ ## n n ;							\
do_define_int n
.macro do_define_int name
	.ifndef IVEC
		.error "IVEC not defined"
	.endif
	.ifndef IHSRR
2020-02-25 20:35:27 +03:00
		IHSRR=0
	.endif
	.ifndef IHSRR_IF_HVMODE
		IHSRR_IF_HVMODE=0
2020-02-25 20:35:10 +03:00
	.endif
	.ifndef IAREA
		IAREA=PACA_EXGEN
	.endif
2020-02-25 20:35:19 +03:00
	.ifndef IVIRT
		IVIRT=1
	.endif
2020-02-25 20:35:18 +03:00
	.ifndef IISIDE
		IISIDE=0
	.endif
2020-02-25 20:35:10 +03:00
	.ifndef IDAR
		IDAR=0
	.endif
	.ifndef IDSISR
		IDSISR=0
	.endif
	.ifndef ISET_RI
		ISET_RI=1
	.endif
2020-02-25 20:35:22 +03:00
	.ifndef IBRANCH_TO_COMMON
		IBRANCH_TO_COMMON=1
	.endif
	.ifndef IREALMODE_COMMON
		IREALMODE_COMMON=0
	.else
		.if ! IBRANCH_TO_COMMON
			.error "IREALMODE_COMMON=1 but IBRANCH_TO_COMMON=0"
		.endif
2020-02-25 20:35:10 +03:00
	.endif
	.ifndef IMASK
		IMASK=0
	.endif
2020-02-25 20:35:12 +03:00
	.ifndef IKVM_SKIP
		IKVM_SKIP=0
	.endif
2020-02-25 20:35:10 +03:00
	.ifndef IKVM_REAL
		IKVM_REAL=0
	.endif
	.ifndef IKVM_VIRT
		IKVM_VIRT=0
	.endif
2020-02-25 20:35:11 +03:00
	.ifndef ISTACK
		ISTACK=1
	.endif
	.ifndef IRECONCILE
		IRECONCILE=1
	.endif
	.ifndef IKUAP
		IKUAP=1
	.endif
2020-02-25 20:35:10 +03:00
.endm
2019-06-22 16:15:27 +03:00
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
/*
2020-02-25 20:35:29 +03:00
 * All interrupts which set HSRR registers, as well as SRESET and MCE and
 * syscall when invoked with "sc 1" switch to MSR[HV]=1 (HVMODE) to be taken,
 * so they all generally need to test whether they were taken in guest context.
 *
 * Note: SRESET and MCE may also be sent to the guest by the hypervisor, and be
 * taken with MSR[HV]=0.
 *
 * Interrupts which set SRR registers (with the above exceptions) do not
 * elevate to MSR[HV]=1 mode, though most can be taken when running with
 * MSR[HV]=1 (e.g., bare metal kernel and userspace). So these interrupts do
 * not need to test whether a guest is running because they get delivered to
 * the guest directly, including nested HV KVM guests.
 *
 * The exception is PR KVM, where the guest runs with MSR[PR]=1 and the host
 * runs with MSR[HV]=0, so the host takes all interrupts on behalf of the
 * guest. PR KVM runs with LPCR[AIL]=0 which causes interrupts to always be
 * delivered to the real-mode entry point, therefore such interrupts only test
 * KVM in their real mode handlers, and only when PR KVM is possible.
 *
 * Interrupts that are taken in MSR[HV]=0 and escalate to MSR[HV]=1 are always
 * delivered in real-mode when the MMU is in hash mode because the MMU
 * registers are not set appropriately to translate host addresses. In nested
 * radix mode these can be delivered in virt-mode as the host translations are
 * used implicitly (see: effective LPID, effective PID).
 */
/*
 * If an interrupt is taken while a guest is running, it is immediately routed
 * to KVM to handle. If both HV and PR KVM are possible, KVM interrupts go first
 * to kvmppc_interrupt_hv, which handles the PR guest case.
2019-06-22 16:15:27 +03:00
 */
#define kvmppc_interrupt kvmppc_interrupt_hv
#else
#define kvmppc_interrupt kvmppc_interrupt_pr
#endif
2020-02-25 20:35:24 +03:00
.macro KVMTEST name
2019-06-22 16:15:27 +03:00
	lbz	r10,HSTATE_IN_GUEST(r13)
	cmpwi	r10,0
2019-08-02 13:57:00 +03:00
	bne	\name\()_kvm
2019-06-22 16:15:27 +03:00
.endm
2020-02-25 20:35:17 +03:00
.macro GEN_KVM name
2020-02-25 20:35:21 +03:00
	.balign IFETCH_ALIGN_BYTES
\name\()_kvm:
2020-02-25 20:35:17 +03:00
	.if IKVM_SKIP
2019-06-22 16:15:27 +03:00
	cmpwi	r10,KVM_GUEST_MODE_SKIP
	beq	89f
	.else
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
2020-02-25 20:35:17 +03:00
	ld	r10,IAREA+EX_CFAR(r13)
2019-06-22 16:15:27 +03:00
	std	r10,HSTATE_CFAR(r13)
2020-02-25 20:35:23 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
2019-06-22 16:15:27 +03:00
	.endif
2020-06-15 09:12:47 +03:00
	ld	r10,IAREA+EX_CTR(r13)
2020-02-25 20:35:21 +03:00
	mtctr	r10
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
2020-02-25 20:35:17 +03:00
	ld	r10,IAREA+EX_PPR(r13)
2019-06-22 16:15:27 +03:00
	std	r10,HSTATE_PPR(r13)
2020-02-25 20:35:23 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
2020-02-25 20:35:21 +03:00
	ld	r11,IAREA+EX_R11(r13)
	ld	r12,IAREA+EX_R12(r13)
2019-06-22 16:15:27 +03:00
	std	r12,HSTATE_SCRATCH0(r13)
	sldi	r12,r9,32
2020-02-25 20:35:21 +03:00
	ld	r9,IAREA+EX_R9(r13)
	ld	r10,IAREA+EX_R10(r13)
2019-06-22 16:15:27 +03:00
	/* HSRR variants have the 0x2 bit added to their trap number */
2020-02-25 20:35:27 +03:00
	.if IHSRR_IF_HVMODE
2019-08-02 13:56:44 +03:00
	BEGIN_FTR_SECTION
2020-02-25 20:35:17 +03:00
	ori	r12,r12,(IVEC + 0x2)
2019-08-02 13:56:44 +03:00
	FTR_SECTION_ELSE
2020-02-25 20:35:17 +03:00
	ori	r12,r12,(IVEC)
2019-08-02 13:56:44 +03:00
	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
2020-02-25 20:35:17 +03:00
	.elseif IHSRR
	ori	r12,r12,(IVEC + 0x2)
2019-06-22 16:15:27 +03:00
	.else
2020-02-25 20:35:17 +03:00
	ori	r12,r12,(IVEC)
2019-06-22 16:15:27 +03:00
	.endif
2019-06-22 16:15:29 +03:00
	b	kvmppc_interrupt
2019-06-22 16:15:27 +03:00
2020-02-25 20:35:17 +03:00
	.if IKVM_SKIP
2019-06-22 16:15:27 +03:00
89:	mtocrf	0x80,r9
2020-06-15 09:12:47 +03:00
	ld	r10,IAREA+EX_CTR(r13)
2020-02-25 20:35:21 +03:00
	mtctr	r10
2020-02-25 20:35:17 +03:00
	ld	r9,IAREA+EX_R9(r13)
	ld	r10,IAREA+EX_R10(r13)
2020-02-25 20:35:21 +03:00
	ld	r11,IAREA+EX_R11(r13)
	ld	r12,IAREA+EX_R12(r13)
2020-02-25 20:35:27 +03:00
	.if IHSRR_IF_HVMODE
2019-08-02 13:56:44 +03:00
	BEGIN_FTR_SECTION
	b	kvmppc_skip_Hinterrupt
	FTR_SECTION_ELSE
	b	kvmppc_skip_interrupt
	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
2020-02-25 20:35:17 +03:00
	.elseif IHSRR
2019-06-22 16:15:27 +03:00
	b	kvmppc_skip_Hinterrupt
	.else
	b	kvmppc_skip_interrupt
	.endif
	.endif
.endm
#else
2020-02-25 20:35:24 +03:00
.macro KVMTEST name
2019-06-22 16:15:27 +03:00
.endm
2020-02-25 20:35:17 +03:00
.macro GEN_KVM name
2019-06-22 16:15:27 +03:00
.endm
#endif
2019-08-02 13:56:58 +03:00
/*
 * This is the BOOK3S interrupt entry code macro.
 *
 * This can result in one of several things happening:
 * - Branch to the _common handler, relocated, in virtual mode.
 *   These are normal interrupts (synchronous and asynchronous) handled by
 *   the kernel.
 * - Branch to KVM, relocated but real mode interrupts remain in real mode.
 *   These occur when HSTATE_IN_GUEST is set. The interrupt may be caused by
 *   / intended for host or guest kernel, but KVM must always be involved
 *   because the machine state is set for guest execution.
 * - Branch to the masked handler, unrelocated.
 *   These occur when maskable asynchronous interrupts are taken with the
 *   irq_soft_mask set.
 * - Branch to an "early" handler in real mode but relocated.
 *   This is done if early=1. MCE and HMI use these to handle errors in real
 *   mode.
 * - Fall through and continue executing in real, unrelocated mode.
 *   This is done if early=2.
 */
2020-02-25 20:35:19 +03:00
.macro GEN_BRANCH_TO_COMMON name, virt
2020-02-25 20:35:22 +03:00
	.if IREALMODE_COMMON
	LOAD_HANDLER(r10, \name\()_common)
	mtctr	r10
	bctr
	.else
2020-02-25 20:35:19 +03:00
	.if \virt
#ifndef CONFIG_RELOCATABLE
	b	\name\()_common_virt
#else
	LOAD_HANDLER(r10, \name\()_common_virt)
	mtctr	r10
	bctr
#endif
	.else
	LOAD_HANDLER(r10, \name\()_common_real)
	mtctr	r10
	bctr
	.endif
2020-02-25 20:35:22 +03:00
	.endif
2020-02-25 20:35:19 +03:00
.endm
2020-02-25 20:35:15 +03:00
.macro GEN_INT_ENTRY name, virt, ool=0
2019-08-02 13:56:58 +03:00
	SET_SCRATCH0(r13)			/* save r13 */
	GET_PACA(r13)
2020-02-25 20:35:15 +03:00
	std	r9,IAREA+EX_R9(r13)		/* save r9 */
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
	mfspr	r9,SPRN_PPR
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
2019-08-02 13:56:58 +03:00
	HMT_MEDIUM
2020-02-25 20:35:15 +03:00
	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
	mfspr	r10,SPRN_CFAR
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
2019-08-02 13:56:58 +03:00
	.if \ool
	.if !\virt
	b	tramp_real_\name
	.pushsection .text
	TRAMP_REAL_BEGIN(tramp_real_\name)
	.else
	b	tramp_virt_\name
	.pushsection .text
	TRAMP_VIRT_BEGIN(tramp_virt_\name)
	.endif
	.endif
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
	std	r9,IAREA+EX_PPR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
BEGIN_FTR_SECTION
	std	r10,IAREA+EX_CFAR(r13)
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
2019-08-02 13:56:58 +03:00
	INTERRUPT_TO_KERNEL
2020-02-25 20:35:19 +03:00
	mfctr	r10
	std	r10,IAREA+EX_CTR(r13)
2019-08-02 13:56:58 +03:00
	mfcr	r9
2020-02-25 20:35:15 +03:00
	std	r11,IAREA+EX_R11(r13)
	std	r12,IAREA+EX_R12(r13)
2019-08-02 13:56:58 +03:00
	/*
	 * DAR/DSISR, SCRATCH0 must be read before setting MSR[RI],
	 * because a d-side MCE will clobber those registers so is
	 * not recoverable if they are live.
	 */
	GET_SCRATCH0(r10)
2020-02-25 20:35:15 +03:00
	std	r10,IAREA+EX_R13(r13)
2020-02-25 20:35:18 +03:00
	.if IDAR && !IISIDE
2020-02-25 20:35:15 +03:00
	.if IHSRR
2019-08-02 13:56:58 +03:00
	mfspr	r10,SPRN_HDAR
	.else
	mfspr	r10,SPRN_DAR
	.endif
2020-02-25 20:35:15 +03:00
	std	r10,IAREA+EX_DAR(r13)
2019-08-02 13:56:58 +03:00
	.endif
2020-02-25 20:35:18 +03:00
	.if IDSISR && !IISIDE
2020-02-25 20:35:15 +03:00
	.if IHSRR
2019-08-02 13:56:58 +03:00
	mfspr	r10,SPRN_HDSISR
	.else
	mfspr	r10,SPRN_DSISR
	.endif
2020-02-25 20:35:15 +03:00
	stw	r10,IAREA+EX_DSISR(r13)
2019-08-02 13:56:58 +03:00
	.endif
2020-02-25 20:35:27 +03:00
	.if IHSRR_IF_HVMODE
2020-02-25 20:35:19 +03:00
	BEGIN_FTR_SECTION
	mfspr	r11,SPRN_HSRR0		/* save HSRR0 */
	mfspr	r12,SPRN_HSRR1		/* and HSRR1 */
	FTR_SECTION_ELSE
	mfspr	r11,SPRN_SRR0		/* save SRR0 */
	mfspr	r12,SPRN_SRR1		/* and SRR1 */
	ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
	.elseif IHSRR
	mfspr	r11,SPRN_HSRR0		/* save HSRR0 */
	mfspr	r12,SPRN_HSRR1		/* and HSRR1 */
	.else
	mfspr	r11,SPRN_SRR0		/* save SRR0 */
	mfspr	r12,SPRN_SRR1		/* and SRR1 */
2019-08-02 13:56:58 +03:00
	.endif
2020-02-25 20:35:22 +03:00
	.if IBRANCH_TO_COMMON
2020-02-25 20:35:19 +03:00
	GEN_BRANCH_TO_COMMON \name \virt
	.endif
2019-08-02 13:56:58 +03:00
	.if \ool
	.popsection
	.endif
.endm
2019-06-22 16:15:34 +03:00
/*
 * __GEN_COMMON_ENTRY is required to receive the branch from interrupt
 * entry, except in the case of the real-mode handlers which require
 * __GEN_REALMODE_COMMON_ENTRY.
 *
 * This switches to virtual mode and sets MSR[RI].
 */
.macro __GEN_COMMON_ENTRY name
DEFINE_FIXED_SYMBOL(\name\()_common_real)
\name\()_common_real:
	.if IKVM_REAL
		KVMTEST \name
	.endif

	ld	r10,PACAKMSR(r13)	/* get MSR value for kernel */
	/* MSR[RI] is clear iff using SRR regs */
	.if IHSRR == EXC_HV_OR_STD
	BEGIN_FTR_SECTION
	xori	r10,r10,MSR_RI
	END_FTR_SECTION_IFCLR(CPU_FTR_HVMODE)
	.elseif ! IHSRR
	xori	r10,r10,MSR_RI
	.endif
	mtmsrd	r10

	.if IVIRT
	.if IKVM_VIRT
	b	1f /* skip the virt test coming from real */
	.endif

	.balign IFETCH_ALIGN_BYTES
DEFINE_FIXED_SYMBOL(\name\()_common_virt)
\name\()_common_virt:
	.if IKVM_VIRT
		KVMTEST \name
1:
	.endif
	.endif /* IVIRT */
.endm

/*
 * Don't switch to virt mode. Used for early MCE and HMI handlers that
 * want to run in real mode.
 */
.macro __GEN_REALMODE_COMMON_ENTRY name
DEFINE_FIXED_SYMBOL(\name\()_common_real)
\name\()_common_real:
	.if IKVM_REAL
		KVMTEST \name
	.endif
.endm

.macro __GEN_COMMON_BODY name
	.if IMASK
		.if ! ISTACK
		.error "No support for masked interrupt to use custom stack"
		.endif

		/* If coming from user, skip soft-mask tests. */
		andi.	r10,r12,MSR_PR
		bne	2f

		/* Kernel code running below __end_interrupts is implicitly
		 * soft-masked */
		LOAD_HANDLER(r10, __end_interrupts)
		cmpld	r11,r10
		li	r10,IMASK
		blt-	1f

		/* Test the soft mask state against our interrupt's bit */
		lbz	r10,PACAIRQSOFTMASK(r13)
1:		andi.	r10,r10,IMASK
		/* Associate vector numbers with bits in paca->irq_happened */
		.if IVEC == 0x500 || IVEC == 0xea0
		li	r10,PACA_IRQ_EE
		.elseif IVEC == 0x900
		li	r10,PACA_IRQ_DEC
		.elseif IVEC == 0xa00 || IVEC == 0xe80
		li	r10,PACA_IRQ_DBELL
		.elseif IVEC == 0xe60
		li	r10,PACA_IRQ_HMI
		.elseif IVEC == 0xf00
		li	r10,PACA_IRQ_PMI
		.else
		.abort "Bad maskable vector"
		.endif

		.if IHSRR_IF_HVMODE
		BEGIN_FTR_SECTION
		bne	masked_Hinterrupt
		FTR_SECTION_ELSE
		bne	masked_interrupt
		ALT_FTR_SECTION_END_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
		.elseif IHSRR
		bne	masked_Hinterrupt
		.else
		bne	masked_interrupt
		.endif
	.endif

	.if ISTACK
	andi.	r10,r12,MSR_PR		/* See if coming from user	*/
2:	mr	r10,r1			/* Save r1			*/
	subi	r1,r1,INT_FRAME_SIZE	/* alloc frame on kernel stack	*/
	beq-	100f
	ld	r1,PACAKSAVE(r13)	/* kernel stack to use		*/
100:	tdgei	r1,-INT_FRAME_SIZE	/* trap if r1 is in userspace	*/
	EMIT_BUG_ENTRY 100b,__FILE__,__LINE__,0
	.endif

	std	r9,_CCR(r1)		/* save CR in stackframe	*/
	std	r11,_NIP(r1)		/* save SRR0 in stackframe	*/
	std	r12,_MSR(r1)		/* save SRR1 in stackframe	*/
	std	r10,0(r1)		/* make stack chain pointer	*/
	std	r0,GPR0(r1)		/* save r0 in stackframe	*/
	std	r10,GPR1(r1)		/* save r1 in stackframe	*/

	.if ISET_RI
	li	r10,MSR_RI
	mtmsrd	r10,1			/* Set MSR_RI */
	.endif

	.if ISTACK
	.if IKUAP
	kuap_save_amr_and_lock r9, r10, cr1, cr0
	.endif
	beq	101f			/* if from kernel mode		*/
	ACCOUNT_CPU_USER_ENTRY(r13, r9, r10)
BEGIN_FTR_SECTION
	ld	r9,IAREA+EX_PPR(r13)	/* Read PPR from paca		*/
	std	r9,_PPR(r1)
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
101:
	.else
	.if IKUAP
	kuap_save_amr_and_lock r9, r10, cr1
	.endif
	.endif

	/* Save original regs values from save area to stack frame. */
	ld	r9,IAREA+EX_R9(r13)	/* move r9, r10 to stackframe	*/
	ld	r10,IAREA+EX_R10(r13)
	std	r9,GPR9(r1)
	std	r10,GPR10(r1)
	ld	r9,IAREA+EX_R11(r13)	/* move r11 - r13 to stackframe	*/
	ld	r10,IAREA+EX_R12(r13)
	ld	r11,IAREA+EX_R13(r13)
	std	r9,GPR11(r1)
	std	r10,GPR12(r1)
	std	r11,GPR13(r1)

	SAVE_NVGPRS(r1)

	.if IDAR
	.if IISIDE
	ld	r10,_NIP(r1)
	.else
	ld	r10,IAREA+EX_DAR(r13)
	.endif
	std	r10,_DAR(r1)
	.endif

	.if IDSISR
	.if IISIDE
	ld	r10,_MSR(r1)
	lis	r11,DSISR_SRR1_MATCH_64S@h
	and	r10,r10,r11
	.else
	lwz	r10,IAREA+EX_DSISR(r13)
	.endif
	std	r10,_DSISR(r1)
	.endif

BEGIN_FTR_SECTION
	ld	r10,IAREA+EX_CFAR(r13)
	std	r10,ORIG_GPR3(r1)
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
	ld	r10,IAREA+EX_CTR(r13)
	std	r10,_CTR(r1)
	std	r2,GPR2(r1)		/* save r2 in stackframe	*/
	SAVE_4GPRS(3, r1)		/* save r3 - r6 in stackframe	*/
	SAVE_2GPRS(7, r1)		/* save r7, r8 in stackframe	*/
	mflr	r9			/* Get LR, later save to stack	*/
	ld	r2,PACATOC(r13)		/* get kernel TOC into r2	*/
	std	r9,_LINK(r1)
	lbz	r10,PACAIRQSOFTMASK(r13)
	mfspr	r11,SPRN_XER		/* save XER in stackframe	*/
	std	r10,SOFTE(r1)
	std	r11,_XER(r1)
	li	r9,IVEC
	std	r9,_TRAP(r1)		/* set trap number		*/
	li	r10,0
	ld	r11,exception_marker@toc(r2)
	std	r10,RESULT(r1)		/* clear regs->result		*/
	std	r11,STACK_FRAME_OVERHEAD-16(r1) /* mark the frame	*/

	.if ISTACK
	ACCOUNT_STOLEN_TIME
	.endif

	.if IRECONCILE
	RECONCILE_IRQ_STATE(r10, r11)
	.endif
.endm

/*
 * On entry r13 points to the paca, r9-r13 are saved in the paca,
 * r9 contains the saved CR, r11 and r12 contain the saved SRR0 and
 * SRR1, and relocation is on.
 *
 * If stack=0, then the stack is already set in r1, and r1 is saved in r10.
 * PPR save and CPU accounting is not done for the !stack case (XXX why not?)
 */
.macro GEN_COMMON name
	__GEN_COMMON_ENTRY \name
	__GEN_COMMON_BODY \name
.endm
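
/*
 * Illustrative sketch only, not part of the original file: one way the
 * entry macros above might be composed for a single vector. The name
 * "example_irq", the 0x900 vector and the 0x80 size are hypothetical
 * placeholders. EXC_REAL_END, EXC_COMMON_BEGIN and IRQS_DISABLED are
 * assumed to be provided elsewhere in the tree, and every I* parameter
 * not set here is assumed to receive a default from INT_DEFINE_END.
 * Guarded with "#if 0" so it is never assembled.
 */
#if 0
INT_DEFINE_BEGIN(example_irq)
	IVEC=0x900			/* selects PACA_IRQ_DEC in __GEN_COMMON_BODY */
	IMASK=IRQS_DISABLED		/* soft-maskable interrupt */
	ISTACK=1			/* IMASK requires a non-zero ISTACK */
INT_DEFINE_END(example_irq)

EXC_REAL_BEGIN(example_irq, 0x900, 0x80)
	GEN_INT_ENTRY example_irq, virt=0	/* real-mode entry, saves into IAREA */
EXC_REAL_END(example_irq, 0x900, 0x80)

EXC_COMMON_BEGIN(example_irq_common)
	GEN_COMMON example_irq		/* enter virt mode, build the stack frame */
	/* ... call the C handler and return here ... */
#endif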

/*
 * Restore all registers including H/SRR0/1 saved in a stack frame of a
 * standard exception.
 */
.macro EXCEPTION_RESTORE_REGS hsrr=0
	/* Move original SRR0 and SRR1 into the respective regs */
	ld	r9,_MSR(r1)
	.if \hsrr
	mtspr	SPRN_HSRR1,r9
	.else
	mtspr	SPRN_SRR1,r9
	.endif
	ld	r9,_NIP(r1)
	.if \hsrr
	mtspr	SPRN_HSRR0,r9
	.else
	mtspr	SPRN_SRR0,r9
	.endif
	ld	r9,_CTR(r1)
	mtctr	r9
	ld	r9,_XER(r1)
	mtxer	r9
	ld	r9,_LINK(r1)
	mtlr	r9
	ld	r9,_CCR(r1)
	mtcr	r9
	REST_8GPRS(2, r1)
	REST_4GPRS(10, r1)
	REST_GPR(0, r1)
	/* restore original r1. */
	ld	r1,GPR1(r1)
.endm
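
/*
 * Illustrative sketch only, not part of the original file: a possible
 * exit sequence built on EXCEPTION_RESTORE_REGS. RFI_TO_USER_OR_KERNEL
 * and HRFI_TO_USER_OR_KERNEL are assumed to be the return macros defined
 * elsewhere in the tree, and both labels are hypothetical. Guarded with
 * "#if 0" so it is never assembled.
 */
#if 0
example_return_srr:
	EXCEPTION_RESTORE_REGS hsrr=0	/* reload SRR0/1, CTR, XER, LR, CR, GPRs */
	RFI_TO_USER_OR_KERNEL		/* return through SRR0/SRR1 */

example_return_hsrr:
	EXCEPTION_RESTORE_REGS hsrr=1	/* same restore, but into HSRR0/HSRR1 */
	HRFI_TO_USER_OR_KERNEL		/* return through HSRR0/HSRR1 */
#endif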

#define RUNLATCH_ON				\
BEGIN_FTR_SECTION				\
	ld	r3, PACA_THREAD_INFO(r13);	\
	ld	r4,TI_LOCAL_FLAGS(r3);		\
	andi.	r0,r4,_TLF_RUNLATCH;		\
	beql	ppc64_runlatch_on_trampoline;	\
END_FTR_SECTION_IFSET(CPU_FTR_CTRL)

/*
 * When the idle code in power4_idle puts the CPU into NAP mode,
 * it has to do so in a loop, and relies on the external interrupt
 * and decrementer interrupt entry code to get it out of the loop.
 * It sets the _TLF_NAPPING bit in current_thread_info()->local_flags
 * to signal that it is in the loop and needs help to get out.
 */
#ifdef CONFIG_PPC_970_NAP
#define FINISH_NAP				\
BEGIN_FTR_SECTION				\
	ld	r11, PACA_THREAD_INFO(r13);	\
	ld	r9,TI_LOCAL_FLAGS(r11);		\
	andi.	r10,r9,_TLF_NAPPING;		\
	bnel	power4_fixup_nap;		\
END_FTR_SECTION_IFSET(CPU_FTR_CAN_NAP)
#else
#define FINISH_NAP
#endif

/*
 * There are a few constraints to be concerned with.
 * - Real mode exceptions code/data must be located at their physical location.
 * - Virtual mode exceptions must be mapped at their 0xc000... location.
 * - Fixed location code must not call directly beyond the __end_interrupts
 *   area when built with CONFIG_RELOCATABLE. LOAD_HANDLER / bctr sequence
 *   must be used.
 * - LOAD_HANDLER targets must be within first 64K of physical 0 /
 *   virtual 0xc00...
 * - Conditional branch targets must be within +/-32K of caller.
 *
 * "Virtual exceptions" run with relocation on (MSR_IR=1, MSR_DR=1), and
 * therefore don't have to run in physically located code or rfid to
 * virtual mode kernel code. However on relocatable kernels they do have
 * to branch to KERNELBASE offset because the rest of the kernel (outside
 * the exception vectors) may be located elsewhere.
 *
 * Virtual exceptions correspond with physical, except their entry points
 * are offset by 0xc000000000000000 and also tend to get an added 0x4000
 * offset applied. Virtual exceptions are enabled with the Alternate
 * Interrupt Location (AIL) bit set in the LPCR. However this does not
 * guarantee they will be delivered virtually. Some conditions (see the ISA)
 * cause exceptions to be delivered in real mode.
 *
 * The scv instructions are a special case. They get a 0x3000 offset applied.
 * scv exceptions have unique reentrancy properties, see below.
 *
 * It's impossible to receive interrupts below 0x300 via AIL.
 *
 * KVM: None of the virtual exceptions are from the guest. Anything that
 * escalated to HV=1 from HV=0 is delivered via real mode handlers.
 *
 *
 * We layout physical memory as follows:
 * 0x0000 - 0x00ff : Secondary processor spin code
 * 0x0100 - 0x18ff : Real mode pSeries interrupt vectors
 * 0x1900 - 0x2fff : Real mode trampolines
 * 0x3000 - 0x58ff : Relon (IR=1,DR=1) mode pSeries interrupt vectors
 * 0x5900 - 0x6fff : Relon mode trampolines
 * 0x7000 - 0x7fff : FWNMI data area
 * 0x8000 -   .... : Common interrupt handlers, remaining early
 *                   setup code, rest of kernel.
 *
 * We could reclaim 0x4000-0x42ff for real mode trampolines if the space
 * is necessary. Until then it's more consistent to explicitly put VIRT_NONE
 * vectors there.
 */
OPEN_FIXED_SECTION(real_vectors,        0x0100, 0x1900)
OPEN_FIXED_SECTION(real_trampolines,    0x1900, 0x3000)
OPEN_FIXED_SECTION(virt_vectors,        0x3000, 0x5900)
OPEN_FIXED_SECTION(virt_trampolines,    0x5900, 0x7000)

#ifdef CONFIG_PPC_POWERNV
	.globl start_real_trampolines
	.globl end_real_trampolines
	.globl start_virt_trampolines
	.globl end_virt_trampolines
#endif

#if defined(CONFIG_PPC_PSERIES) || defined(CONFIG_PPC_POWERNV)
/*
 * Data area reserved for FWNMI option.
 * This address (0x7000) is fixed by the RPA.
 * pseries and powernv need to keep the whole page from
 * 0x7000 to 0x8000 free for use by the firmware
 */
ZERO_FIXED_SECTION(fwnmi_page,          0x7000, 0x8000)
OPEN_TEXT_SECTION(0x8000)
#else
OPEN_TEXT_SECTION(0x7000)
#endif

USE_FIXED_SECTION(real_vectors)

/*
 * This is the start of the interrupt handlers for pSeries
 * This code runs with relocation off.
 * Code from here to __end_interrupts gets copied down to real
 * address 0x100 when we are running a relocatable kernel.
 * Therefore any relative branches in this section must only
 * branch to labels in this section.
 */
	.globl __start_interrupts
__start_interrupts:

/**
 * Interrupt 0x3000 - System Call Vectored Interrupt (syscall).
 * This is a synchronous interrupt invoked with the "scv" instruction. The
 * system call does not alter the HV bit, so it is directed to the OS.
 *
 * Handling:
 * scv instructions enter the kernel without changing EE, RI, ME, or HV.
 * In particular, this means we can take a maskable interrupt at any point
 * in the scv handler, which is unlike any other interrupt. This is solved
 * by treating the instruction addresses below __end_interrupts as being
 * soft-masked.
 *
 * AIL-0 mode scv exceptions go to 0x17000-0x17fff, but we set AIL-3 and
 * ensure scv is never executed with relocation off, which means AIL-0
 * should never happen.
 *
 * Before leaving the below __end_interrupts text, at least one of the
 * following must be true:
 * - MSR[PR]=1 (i.e., return to userspace)
 * - MSR_EE|MSR_RI is set (no reentrant exceptions)
 * - Standard kernel environment is set up (stack, paca, etc)
 *
 * Call convention:
 *
 * syscall register convention is in Documentation/powerpc/syscall64-abi.rst
 */
EXC_VIRT_BEGIN(system_call_vectored, 0x3000, 0x1000)
	/* SCV 0 */
	mr	r9,r13
	GET_PACA(r13)
	mflr	r11
	mfctr	r12
	li	r10,IRQS_ALL_DISABLED
	stb	r10,PACAIRQSOFTMASK(r13)
#ifdef CONFIG_RELOCATABLE
	b	system_call_vectored_tramp
#else
	b	system_call_vectored_common
#endif
	nop

	/* SCV 1 - 127 */
	.rept	127
	mr	r9,r13
	GET_PACA(r13)
	mflr	r11
	mfctr	r12
	li	r10,IRQS_ALL_DISABLED
	stb	r10,PACAIRQSOFTMASK(r13)
	li	r0,-1 /* cause failure */
#ifdef CONFIG_RELOCATABLE
	b	system_call_vectored_sigill_tramp
#else
	b	system_call_vectored_sigill
#endif
	.endr
EXC_VIRT_END(system_call_vectored, 0x3000, 0x1000)

#ifdef CONFIG_RELOCATABLE
TRAMP_VIRT_BEGIN(system_call_vectored_tramp)
	__LOAD_HANDLER(r10, system_call_vectored_common)
	mtctr	r10
	bctr

TRAMP_VIRT_BEGIN(system_call_vectored_sigill_tramp)
	__LOAD_HANDLER(r10, system_call_vectored_sigill)
	mtctr	r10
	bctr
#endif


/* No virt vectors corresponding with 0x0..0x100 */
EXC_VIRT_NONE(0x4000, 0x100)


/**
 * Interrupt 0x100 - System Reset Interrupt (SRESET aka NMI).
 * This is a non-maskable, asynchronous interrupt always taken in real-mode.
 * It is caused by:
 * - Wake from power-saving state, on powernv.
 * - An NMI from another CPU, triggered by firmware or hypercall.
 * - A crash/debug signal injected from BMC, firmware or hypervisor.
 *
 * Handling:
 * Power-save wakeup is the only performance critical path, so this is
 * determined as quickly as possible first. In this case volatile registers
 * can be discarded and SPRs like CFAR don't need to be read.
 *
 * If not a powersave wakeup, then it's run as a regular interrupt, however
 * it uses its own stack and PACA save area to preserve the regular kernel
 * environment for debugging.
 *
 * This interrupt is not maskable, so triggering it when MSR[RI] is clear,
 * or SCRATCH0 is in use, etc. may cause a crash. It's also not entirely
 * correct to switch to virtual mode to run the regular interrupt handler
 * because it might be interrupted when the MMU is in a bad state (e.g., SLB
 * is clear).
 *
 * FWNMI:
 * PAPR specifies a "fwnmi" facility which sends the sreset to a different
 * entry point with a different register setup. Some hypervisors will
 * send the sreset to 0x100 in the guest if it is not fwnmi capable.
 *
 * KVM:
 * Unlike most SRR interrupts, this may be taken by the host while executing
 * in a guest, so a KVM test is required. KVM will pull the CPU out of guest
 * mode and then raise the sreset.
 */
INT_DEFINE_BEGIN(system_reset)
	IVEC=0x100
	IAREA=PACA_EXNMI
	IVIRT=0 /* no virt entry point */
	/*
	 * MSR_RI is not enabled, because PACA_EXNMI and nmi stack is
	 * being used, so a nested NMI exception would corrupt it.
	 */
	ISET_RI=0
	ISTACK=0
	IRECONCILE=0
	IKVM_REAL=1
INT_DEFINE_END(system_reset)

EXC_REAL_BEGIN(system_reset, 0x100, 0x100)
#ifdef CONFIG_PPC_P7_NAP
	/*
	 * If running native on arch 2.06 or later, check if we are waking up
	 * from nap/sleep/winkle, and branch to idle handler. This tests SRR1
	 * bits 46:47. A non-0 value indicates that we are coming from a power
	 * saving state. The idle wakeup handler initially runs in real mode,
	 * but we branch to the 0xc000... address so we can turn on relocation
	 * with mtmsrd later, after SPRs are restored.
	 *
	 * Careful to minimise cost for the fast path (idle wakeup) while
	 * also avoiding clobbering CFAR for the debug path (non-idle).
	 *
	 * For the idle wake case volatile registers can be clobbered, which
	 * is why we use those initially. If it turns out to not be an idle
	 * wake, carefully put everything back the way it was, so we can use
	 * common exception macros to handle it.
	 */
BEGIN_ F T R _ S E C T I O N
2019-06-28 09:33:19 +03:00
	SET_SCRATCH0(r13)
	GET_PACA(r13)
	std	r3,PACA_EXNMI+0*8(r13)
	std	r4,PACA_EXNMI+1*8(r13)
	std	r5,PACA_EXNMI+2*8(r13)
2019-06-22 16:15:15 +03:00
	mfspr	r3,SPRN_SRR1
2019-06-28 09:33:19 +03:00
	mfocrf	r4,0x80
	rlwinm.	r5,r3,47-31,30,31
	bne+	system_reset_idle_wake
	/* Not powersave wakeup. Restore regs for regular interrupt handler. */
	mtocrf	0x80,r4
	ld	r3,PACA_EXNMI+0*8(r13)
	ld	r4,PACA_EXNMI+1*8(r13)
	ld	r5,PACA_EXNMI+2*8(r13)
	GET_SCRATCH0(r13)
2019-06-22 16:15:32 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
2016-10-13 05:17:14 +03:00
#endif
KVM: PPC: Allow book3s_hv guests to use SMT processor modes
This lifts the restriction that book3s_hv guests can only run one
hardware thread per core, and allows them to use up to 4 threads
per core on POWER7. The host still has to run single-threaded.
This capability is advertised to qemu through a new KVM_CAP_PPC_SMT
capability. The return value of the ioctl querying this capability
is the number of vcpus per virtual CPU core (vcore), currently 4.
To use this, the host kernel should be booted with all threads
active, and then all the secondary threads should be offlined.
This will put the secondary threads into nap mode. KVM will then
wake them from nap mode and use them for running guest code (while
they are still offline). To wake the secondary threads, we send
them an IPI using a new xics_wake_cpu() function, implemented in
arch/powerpc/sysdev/xics/icp-native.c. In other words, at this stage
we assume that the platform has a XICS interrupt controller and
we are using icp-native.c to drive it. Since the woken thread will
need to acknowledge and clear the IPI, we also export the base
physical address of the XICS registers using kvmppc_set_xics_phys()
for use in the low-level KVM book3s code.
When a vcpu is created, it is assigned to a virtual CPU core.
The vcore number is obtained by dividing the vcpu number by the
number of threads per core in the host. This number is exported
to userspace via the KVM_CAP_PPC_SMT capability. If qemu wishes
to run the guest in single-threaded mode, it should make all vcpu
numbers be multiples of the number of threads per core.
We distinguish three states of a vcpu: runnable (i.e., ready to execute
the guest), blocked (that is, idle), and busy in host. We currently
implement a policy that the vcore can run only when all its threads
are runnable or blocked. This way, if a vcpu needs to execute elsewhere
in the kernel or in qemu, it can do so without being starved of CPU
by the other vcpus.
When a vcore starts to run, it executes in the context of one of the
vcpu threads. The other vcpu threads all go to sleep and stay asleep
until something happens requiring the vcpu thread to return to qemu,
or to wake up to run the vcore (this can happen when another vcpu
thread goes from busy in host state to blocked).
It can happen that a vcpu goes from blocked to runnable state (e.g.
because of an interrupt), and the vcore it belongs to is already
running. In that case it can start to run immediately as long as
none of the vcpus in the vcore have started to exit the guest.
We send the next free thread in the vcore an IPI to get it to start
to execute the guest. It synchronizes with the other threads via
the vcore->entry_exit_count field to make sure that it doesn't go
into the guest if the other vcpus are exiting by the time that it
is ready to actually enter the guest.
Note that there is no fixed relationship between the hardware thread
number and the vcpu number. Hardware threads are assigned to vcpus
as they become runnable, so we will always use the lower-numbered
hardware threads in preference to higher-numbered threads if not all
the vcpus in the vcore are runnable, regardless of which vcpus are
runnable.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
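The vcpu-to-vcore mapping and the run policy described in this commit message can be sketched in C. The helper names and the enum are hypothetical, chosen for illustration; the kernel's own data structures are not reproduced here.

```c
#include <stdbool.h>
#include <stddef.h>

/* The three vcpu states named in the commit message (hypothetical enum). */
enum vcpu_state { VCPU_RUNNABLE, VCPU_BLOCKED, VCPU_BUSY_IN_HOST };

/* vcore number = vcpu number divided by the host's threads per core. */
unsigned int vcpu_to_vcore(unsigned int vcpu_id, unsigned int threads_per_core)
{
	return vcpu_id / threads_per_core;
}

/*
 * For a single-threaded guest, every vcpu number is a multiple of the
 * threads-per-core value, so each vcpu lands in its own vcore.
 */
unsigned int single_threaded_vcpu_id(unsigned int n, unsigned int threads_per_core)
{
	return n * threads_per_core;
}

/* Policy: a vcore may run only when all its vcpus are runnable or blocked. */
bool vcore_can_run(const enum vcpu_state *states, size_t nr_vcpus)
{
	for (size_t i = 0; i < nr_vcpus; i++) {
		if (states[i] == VCPU_BUSY_IN_HOST)
			return false;
	}
	return true;
}
```

With four threads per core, vcpus 0 to 3 share vcore 0 and vcpu 5 belongs to vcore 1, while a single-threaded guest would number its vcpus 0, 4, 8, and so on.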
2011-06-29 04:23:08 +04:00
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY system_reset, virt=0
2016-12-19 21:30:05 +03:00
	/*
2019-06-28 09:33:19 +03:00
	 * In theory, we should not enable relocation here if it was disabled
	 * in SRR1, because the MMU may not be configured to support it (e.g.,
	 * SLB may have been cleared). In practice, there should only be a few
	 * small windows where that's the case, and sreset is considered to
	 * be dangerous anyway.
2016-12-19 21:30:05 +03:00
	 */
2016-12-06 04:41:12 +03:00
EXC_REAL_END(system_reset, 0x100, 0x100)
EXC_VIRT_NONE(0x4100, 0x100)
2016-10-13 05:17:14 +03:00
#ifdef CONFIG_PPC_P7_NAP
2019-06-28 09:33:19 +03:00
TRAMP_REAL_BEGIN(system_reset_idle_wake)
	/* We are waking up from idle, so may clobber any volatile register */
	cmpwi	cr1,r5,2
	bltlr	cr1	/* no state loss, return to idle caller with r3=SRR1 */
	BRANCH_TO_C000(r12, DOTSYM(idle_return_gpr_loss))
2011-06-29 04:23:08 +04:00
#endif
2019-06-28 09:33:20 +03:00
#ifdef CONFIG_PPC_PSERIES
/*
 * Vectors for the FWNMI option. Share common code.
 */
TRAMP_REAL_BEGIN(system_reset_fwnmi)
2020-02-25 20:35:28 +03:00
	/* XXX: fwnmi guest could run a nested/PR guest, so why no test? */
2020-02-25 20:35:14 +03:00
	__IKVM_REAL(system_reset)=0
	GEN_INT_ENTRY system_reset, virt=0
2019-06-28 09:33:20 +03:00
#endif /* CONFIG_PPC_PSERIES */
2016-12-19 21:30:04 +03:00
EXC_COMMON_BEGIN(system_reset_common)
2020-02-25 20:35:19 +03:00
	__GEN_COMMON_ENTRY system_reset
2016-12-19 21:30:05 +03:00
	/*
	 * Increment paca->in_nmi then enable MSR_RI. SLB or MCE will be able
	 * to recover, but nested NMI will notice in_nmi and not recover
	 * because of the use of the NMI stack. in_nmi reentrancy is tested in
	 * system_reset_exception.
	 */
	lhz	r10,PACA_IN_NMI(r13)
	addi	r10,r10,1
	sth	r10,PACA_IN_NMI(r13)
	li	r10,MSR_RI
	mtmsrd	r10,1
2014-02-26 04:08:25 +04:00
2016-12-19 21:30:06 +03:00
	mr	r10,r1
	ld	r1,PACA_NMI_EMERG_SP(r13)
	subi	r1,r1,INT_FRAME_SIZE
2020-02-25 20:35:19 +03:00
	__GEN_COMMON_BODY system_reset
2019-06-22 16:15:21 +03:00
	/*
2020-02-25 20:35:30 +03:00
	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
2019-06-22 16:15:21 +03:00
	 * the right thing. We do not want to reconcile because that goes
	 * through irq tracing which we don't want in NMI.
	 *
2020-05-08 07:33:55 +03:00
	 * Save PACAIRQHAPPENED to RESULT (otherwise unused), and set HARD_DIS
2020-02-25 20:35:30 +03:00
	 * as we are running with MSR[EE]=0.
2019-06-22 16:15:21 +03:00
	 */
	li	r10,IRQS_ALL_DISABLED
	stb	r10,PACAIRQSOFTMASK(r13)
	lbz	r10,PACAIRQHAPPENED(r13)
2020-05-08 07:33:55 +03:00
	std	r10,RESULT(r1)
2020-02-25 20:35:30 +03:00
	ori	r10,r10,PACA_IRQ_HARD_DIS
	stb	r10,PACAIRQHAPPENED(r13)
2019-06-22 16:15:21 +03:00
2019-06-22 16:15:20 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	system_reset_exception
2018-03-26 18:01:03 +03:00
	/* Clear MSR_RI before setting SRR0 and SRR1. */
2019-06-28 08:33:22 +03:00
	li	r9,0
2018-03-26 18:01:03 +03:00
	mtmsrd	r9,1
2016-12-19 21:30:05 +03:00
	/*
2018-03-26 18:01:03 +03:00
	 * MSR_RI is clear, now we can decrement paca->in_nmi.
2016-12-19 21:30:05 +03:00
	 */
	lhz	r10,PACA_IN_NMI(r13)
	subi	r10,r10,1
	sth	r10,PACA_IN_NMI(r13)
2018-03-26 18:01:03 +03:00
	/*
	 * Restore soft mask settings.
	 */
2020-05-08 07:33:55 +03:00
	ld	r10,RESULT(r1)
2018-03-26 18:01:03 +03:00
	stb	r10,PACAIRQHAPPENED(r13)
	ld	r10,SOFTE(r1)
	stb	r10,PACAIRQSOFTMASK(r13)
2020-04-29 09:56:54 +03:00
	kuap_restore_amr r9, r10
2020-02-25 20:35:27 +03:00
	EXCEPTION_RESTORE_REGS
2018-03-26 18:01:03 +03:00
	RFI_TO_USER_OR_KERNEL
2016-09-21 10:43:30 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM system_reset
2009-06-03 01:17:38 +04:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x200 - Machine Check Interrupt (MCE).
 * This is a non-maskable interrupt always taken in real-mode. It can be
 * synchronous or asynchronous, caused by hardware or software, and it may be
 * taken in a power-saving state.
 *
 * Handling:
 * Similarly to system reset, this uses its own stack and PACA save area,
 * the difference is re-entrancy is allowed on the machine check stack.
 *
 * machine_check_early is run in real mode, and carefully decodes the
 * machine check and tries to handle it (e.g., flush the SLB if there was an
 * error detected there), determines if it was recoverable and logs the
 * event.
 *
2020-02-25 20:35:30 +03:00
 * This early code does not "reconcile" irq soft-mask state like SRESET or
 * regular interrupts do, so irqs_disabled() among other things may not work
 * properly (irq disable/enable already doesn't work because irq tracing can
 * not work in real mode).
 *
2020-02-25 20:35:28 +03:00
 * Then, depending on the execution context when the interrupt is taken, there
 * are 3 main actions:
 * - Executing in kernel mode. The event is queued with irq_work, which means
 *   it is handled when it is next safe to do so (i.e., the kernel has enabled
 *   interrupts), which could be immediately when the interrupt returns. This
 *   avoids nasty issues like switching to virtual mode when the MMU is in a
 *   bad state, or when executing OPAL code. (SRESET is exposed to such issues,
 *   but it has different priorities). Check to see if the CPU was in power
 *   save, and return via the wakeup code if it was.
 *
 * - Executing in user mode. machine_check_exception is run like a normal
 *   interrupt handler, which processes the data generated by the early handler.
 *
 * - Executing in guest mode. The interrupt is run with its KVM test, and
 *   branches to KVM to deal with. KVM may queue the event for the host
 *   to report later.
 *
 * This interrupt is not maskable, so if it triggers when MSR[RI] is clear,
 * or SCRATCH0 is in use, it may cause a crash.
 *
 * KVM:
 * See SRESET.
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(machine_check_early)
	IVEC=0x200
	IAREA=PACA_EXMC
2020-02-25 20:35:19 +03:00
	IVIRT=0 /* no virt entry point */
2020-02-25 20:35:22 +03:00
	IREALMODE_COMMON=1
2019-08-02 13:56:36 +03:00
	/*
	 * MSR_RI is not enabled, because PACA_EXMC is being used, so a
	 * nested machine check corrupts it. machine_check_common enables
	 * MSR_RI.
	 */
2020-02-25 20:35:14 +03:00
	ISET_RI=0
	ISTACK=0
	IDAR=1
	IDSISR=1
	IRECONCILE=0
	IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
INT_DEFINE_END(machine_check_early)
INT_DEFINE_BEGIN(machine_check)
	IVEC=0x200
	IAREA=PACA_EXMC
2020-02-25 20:35:19 +03:00
	IVIRT=0 /* no virt entry point */
2020-02-25 20:35:14 +03:00
	ISET_RI=0
	IDAR=1
	IDSISR=1
	IKVM_SKIP=1
	IKVM_REAL=1
INT_DEFINE_END(machine_check)
EXC_REAL_BEGIN(machine_check, 0x200, 0x100)
	GEN_INT_ENTRY machine_check_early, virt=0
2016-12-06 04:41:12 +03:00
EXC_REAL_END(machine_check, 0x200, 0x100)
EXC_VIRT_NONE(0x4200, 0x100)
2019-08-02 13:56:36 +03:00
2019-08-02 13:56:37 +03:00
#ifdef CONFIG_PPC_PSERIES
TRAMP_REAL_BEGIN(machine_check_fwnmi)
	/* See comment at machine_check exception, don't turn on RI */
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY machine_check_early, virt=0
2019-08-02 13:56:37 +03:00
#endif
2019-08-02 13:56:40 +03:00
#define MACHINE_CHECK_HANDLER_WINDUP			\
	/* Clear MSR_RI before setting SRR0 and SRR1. */\
	li	r9,0;					\
	mtmsrd	r9,1;		/* Clear MSR_RI */	\
	/* Decrement paca->in_mce now RI is clear. */	\
	lhz	r12,PACA_IN_MCE(r13);			\
	subi	r12,r12,1;				\
	sth	r12,PACA_IN_MCE(r13);			\
2020-02-25 20:35:27 +03:00
	EXCEPTION_RESTORE_REGS
2019-08-02 13:56:40 +03:00
2019-08-02 13:56:36 +03:00
EXC_COMMON_BEGIN(machine_check_early_common)
2020-02-25 20:35:21 +03:00
	__GEN_REALMODE_COMMON_ENTRY machine_check_early
2016-09-21 10:43:31 +03:00
	/*
	 * Switch to mc_emergency stack and handle re-entrancy (we limit
	 * the nested MCE up to level 4 to avoid stack overflow).
	 * Save MCE registers srr1, srr0, dar and dsisr and then set ME=1
	 *
	 * We use paca->in_mce to check whether this is the first entry or
	 * nested machine check. We increment paca->in_mce to track nested
	 * machine checks.
	 *
	 * If this is the first entry then set stack pointer to
	 * paca->mc_emergency_sp, otherwise r1 is already pointing to
	 * stack frame on mc_emergency stack.
	 *
	 * NOTE: We are here with MSR_ME=0 (off), which means we risk a
	 * checkstop if we get another machine check exception before we do
	 * rfid with MSR_ME=1.
2017-04-19 16:05:47 +03:00
	 *
	 * This interrupt can wake directly from idle. If that is the case,
	 * the machine check is handled then the idle wakeup code is called
2018-07-05 11:47:00 +03:00
	 * to restore state.
2016-09-21 10:43:31 +03:00
	 */
	lhz	r10,PACA_IN_MCE(r13)
	cmpwi	r10,0			/* Are we in nested machine check */
2019-08-02 13:56:36 +03:00
	cmpwi	cr1,r10,MAX_MCE_DEPTH	/* Are we at maximum nesting */
2016-09-21 10:43:31 +03:00
	addi	r10,r10,1		/* increment paca->in_mce */
	sth	r10,PACA_IN_MCE(r13)
2019-08-02 13:56:36 +03:00
	mr	r10,r1			/* Save r1 */
	bne	1f
	/* First machine check entry */
	ld	r1,PACAMCEMERGSP(r13)	/* Use MC emergency stack */
2019-08-02 13:56:39 +03:00
1:	/* Limit nested MCE to level 4 to avoid stack overflow */
	bgt	cr1,unrecoverable_mce	/* Check if we hit limit of 4 */
	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
2019-08-02 13:56:36 +03:00
2020-02-25 20:35:19 +03:00
	__GEN_COMMON_BODY machine_check_early
2019-08-02 13:56:36 +03:00
2018-09-11 17:27:23 +03:00
BEGIN_FTR_SECTION
2019-08-02 13:56:38 +03:00
	bl	enable_machine_check
2018-09-11 17:27:23 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
2019-08-02 13:56:38 +03:00
	li	r10,MSR_RI
	mtmsrd	r10,1
2020-05-08 07:33:56 +03:00
	/*
	 * Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
	 * system_reset_common)
	 */
	li	r10,IRQS_ALL_DISABLED
	stb	r10,PACAIRQSOFTMASK(r13)
	lbz	r10,PACAIRQHAPPENED(r13)
	std	r10,RESULT(r1)
	ori	r10,r10,PACA_IRQ_HARD_DIS
	stb	r10,PACAIRQHAPPENED(r13)
2016-09-21 10:43:31 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	machine_check_early
	std	r3,RESULT(r1)	/* Save result */
	ld	r12,_MSR(r1)
2017-04-19 16:05:47 +03:00
2020-05-08 07:33:56 +03:00
	/*
	 * Restore soft mask settings.
	 */
	ld	r10,RESULT(r1)
	stb	r10,PACAIRQHAPPENED(r13)
	ld	r10,SOFTE(r1)
	stb	r10,PACAIRQSOFTMASK(r13)
2019-08-02 13:56:28 +03:00
#ifdef CONFIG_PPC_P7_NAP
2016-09-21 10:43:31 +03:00
	/*
	 * Check if thread was in power saving mode. We come here when any
	 * of the following is true:
	 * a. thread wasn't in power saving mode
	 * b. thread was in power saving mode with no state loss,
	 *    supervisor state loss or hypervisor state loss.
	 *
	 * Go back to nap/sleep/winkle mode again if (b) is true.
	 */
2019-06-22 16:15:32 +03:00
BEGIN_FTR_SECTION
2017-04-19 16:05:47 +03:00
	rlwinm.	r11,r12,47-31,30,31
2017-05-04 13:41:12 +03:00
	bne	machine_check_idle_common
2019-06-22 16:15:32 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE | CPU_FTR_ARCH_206)
2016-09-21 10:43:31 +03:00
#endif
2017-04-19 16:05:47 +03:00
2016-09-21 10:43:31 +03:00
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
	/*
2019-08-02 13:56:28 +03:00
	 * Check if we are coming from guest. If yes, then run the normal
2019-08-02 13:57:00 +03:00
	 * exception handler which will take the
	 * machine_check_kvm->kvmppc_interrupt branch to deliver the MC event
	 * to guest.
2016-09-21 10:43:31 +03:00
	 */
	lbz	r11,HSTATE_IN_GUEST(r13)
	cmpwi	r11,0			/* Check if coming from guest */
2019-08-02 13:56:41 +03:00
	bne	mce_deliver		/* continue if we are. */
2016-09-21 10:43:31 +03:00
#endif
2019-08-02 13:56:28 +03:00
	/*
	 * Check if we are coming from userspace. If yes, then run the normal
	 * exception handler which will deliver the MC event to this kernel.
	 */
	andi.	r11,r12,MSR_PR		/* See if coming from user. */
2019-08-02 13:56:41 +03:00
	bne	mce_deliver		/* continue in V mode if we are. */
2019-08-02 13:56:28 +03:00
2016-09-21 10:43:31 +03:00
	/*
2019-08-02 13:56:28 +03:00
	 * At this point we are coming from kernel context.
2016-09-21 10:43:31 +03:00
	 * Queue up the MCE event and return from the interrupt.
	 * But before that, check if this is an un-recoverable exception.
	 * If yes, then stay on emergency stack and panic.
	 */
	andi.	r11,r12,MSR_RI
2019-08-02 13:56:39 +03:00
	beq	unrecoverable_mce
2016-09-21 10:43:31 +03:00
	/*
	 * Check if we have successfully handled/recovered from error, if not
	 * then stay on emergency stack and panic.
	 */
	ld	r3,RESULT(r1)	/* Load result */
	cmpdi	r3,0		/* see if we handled MCE successfully */
2019-08-02 13:56:39 +03:00
	beq	unrecoverable_mce /* if !handled then panic */
powerpc/64s/exception: machine check pseries should skip the late handler for kernel MCEs
The powernv machine check handler copes with taking a MCE from one of
three contexts, guest, kernel, and user. In each case the early
handler runs first on a special stack, then:
- The guest case branches to the KVM interrupt handler (via standard
interrupt macros).
- The user case will run the "late" handler which is like a normal
interrupt that runs in virtual mode and uses the regular kernel
stack.
- The kernel case queues the event and schedules it for processing
with irq work.
The last case is important, it must not enable virtual memory because
the MMU state may not be set up to deal with that (e.g., SLB might be
clear), it must not use the regular kernel stack for similar reasons
(e.g., might be in OPAL with OPAL stack in r1), and the kernel does
not expect anything to touch its stack if interrupts are disabled.
The pseries handler does not do this queueing, but instead it always
runs the late handler for host MCEs, which has some of the same
problems.
Now that pseries is using machine_check_events, change it to do the
same as powernv and queue events for kernel MCEs.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20190802105709.27696-11-npiggin@gmail.com
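The three-context dispatch described in this commit message, together with the checks made by the early handler's assembly, can be modelled in C. This is a sketch under stated assumptions: the struct, enum, and function names are hypothetical, and each boolean stands in for a register test (SRR1 bits 46:47, `HSTATE_IN_GUEST`, `MSR_PR`, `MSR_RI`, and the `machine_check_early` result).

```c
#include <stdbool.h>

/* Hypothetical outcome names; the comments give the assembly branch target. */
enum mce_action {
	MCE_IDLE_WAKEUP,	/* machine_check_idle_common */
	MCE_DELIVER,		/* mce_deliver: run the "late" handler or KVM test */
	MCE_UNRECOVERABLE,	/* unrecoverable_mce: stay on emergency stack, panic */
	MCE_QUEUE_EVENT		/* machine_check_queue_event, then RFI_TO_KERNEL */
};

/* Hypothetical snapshot of the state the early handler inspects. */
struct mce_context {
	bool woke_from_idle;	/* SRR1 bits 46:47 non-zero (CONFIG_PPC_P7_NAP) */
	bool in_guest;		/* HSTATE_IN_GUEST != 0 (KVM handler built in) */
	bool from_user;		/* MSR_PR set in the saved MSR */
	bool msr_ri;		/* MSR_RI set in the saved MSR */
	bool handled;		/* machine_check_early returned non-zero */
};

/* The checks run in the same order as in the assembly. */
enum mce_action mce_dispatch(const struct mce_context *c)
{
	if (c->woke_from_idle)
		return MCE_IDLE_WAKEUP;
	if (c->in_guest || c->from_user)
		return MCE_DELIVER;
	/* Kernel context: never switch to virtual mode or the kernel stack. */
	if (!c->msr_ri || !c->handled)
		return MCE_UNRECOVERABLE;
	return MCE_QUEUE_EVENT;
}
```

The kernel case is the only one that queues the event, which matches the commit's point that a kernel-context MCE must not enable virtual memory or touch the regular kernel stack.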
2019-08-02 13:56:35 +03:00
2016-09-21 10:43:31 +03:00
	/*
	 * Return from MC interrupt.
	 * Queue up the MCE event so that we can log it later, while
	 * returning from kernel or opal call.
	 */
	bl	machine_check_queue_event
	MACHINE_CHECK_HANDLER_WINDUP
2019-08-02 13:56:29 +03:00
	RFI_TO_KERNEL
2019-08-02 13:56:35 +03:00
2019-08-02 13:56:41 +03:00
mce_deliver:
	/*
	 * This is a host user or guest MCE. Restore all registers, then
	 * run the "late" handler. For host user, this will run the
	 * machine_check_exception handler in virtual mode like a normal
	 * interrupt handler. For guest, this will trigger the KVM test
	 * and branch to the KVM interrupt similarly to other interrupts.
	 */
2019-08-02 13:56:32 +03:00
BEGIN_FTR_SECTION
	ld	r10,ORIG_GPR3(r1)
	mtspr	SPRN_CFAR,r10
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
2016-09-21 10:43:31 +03:00
	MACHINE_CHECK_HANDLER_WINDUP
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY machine_check, virt=0
2016-09-21 10:43:31 +03:00
2019-08-02 13:56:40 +03:00
EXC_COMMON_BEGIN(machine_check_common)
	/*
	 * Machine check is different because we use a different
	 * save area: PACA_EXMC instead of PACA_EXGEN.
	 */
2020-02-25 20:35:14 +03:00
	GEN_COMMON machine_check
2019-08-02 13:56:40 +03:00
	FINISH_NAP
	/* Enable MSR_RI when finished with PACA_EXMC */
	li	r10,MSR_RI
	mtmsrd	r10,1
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	machine_check_exception
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified: rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless except when PREEMPT=y, this calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2019-08-02 13:56:40 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM machine_check
2019-08-02 13:56:40 +03:00
#ifdef CONFIG_PPC_P7_NAP
/*
 * This is an idle wakeup. Low level machine check has already been
 * done. Queue the event then call the idle code to do the wake up.
 */
EXC_COMMON_BEGIN(machine_check_idle_common)
	bl	machine_check_queue_event
	/*
2020-05-08 07:33:53 +03:00
	 * GPR-loss wakeups are relatively straightforward, because the
	 * idle sleep code has saved all non-volatile registers on its
	 * own stack, and r1 in PACAR1.
2019-08-02 13:56:40 +03:00
	 *
2020-05-08 07:33:53 +03:00
	 * For no-loss wakeups the r1 and lr registers used by the
	 * early machine check handler have to be restored first. r2 is
	 * the kernel TOC, so no need to restore it.
2019-08-02 13:56:40 +03:00
	 *
	 * Then decrement MCE nesting after finishing with the stack.
	 */
	ld	r3,_MSR(r1)
	ld	r4,_LINK(r1)
2020-05-08 07:33:53 +03:00
	ld	r1,GPR1(r1)
2019-08-02 13:56:40 +03:00
	lhz	r11,PACA_IN_MCE(r13)
	subi	r11,r11,1
	sth	r11,PACA_IN_MCE(r13)
	mtlr	r4
	rlwinm	r10,r3,47-31,30,31
	cmpwi	cr1,r10,2
2020-05-08 07:33:53 +03:00
	bltlr	cr1	/* no state loss, return to idle caller with r3=SRR1 */
2019-08-02 13:56:40 +03:00
	b	idle_return_gpr_loss
#endif
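The wake-state test at the end of machine_check_idle_common (rlwinm, cmpwi, bltlr) can be read as the following C. This is a sketch under stated assumptions: the helper and macro names are hypothetical, and it assumes the wake-state field is SRR1 bits 46:47 in IBM numbering (a right shift of 16), with values below 2 meaning no state loss, which is what the comparison against 2 suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: a two-bit wake-state field at bit 16 (counting from the LSB). */
#define SRR1_WAKESTATE_SHIFT	16
#define SRR1_WAKESTATE_MASK	UINT64_C(0x3)
#define WAKE_GPR_LOSS		2	/* assumed: first value that indicates state loss */

static_assert((UINT64_C(0x30000) >> SRR1_WAKESTATE_SHIFT) == SRR1_WAKESTATE_MASK,
	      "wake-state field is two bits wide at bit 16");

/* Models: rlwinm r10,r3,47-31,30,31 (rotate left by 16, keep the low two bits). */
static unsigned int srr1_wake_state(uint64_t srr1)
{
	return (unsigned int)((srr1 >> SRR1_WAKESTATE_SHIFT) & SRR1_WAKESTATE_MASK);
}

/* Models: cmpwi cr1,r10,2 ; bltlr cr1 (return to the idle caller if below 2). */
static bool wake_without_state_loss(uint64_t srr1)
{
	return srr1_wake_state(srr1) < WAKE_GPR_LOSS;
}
```

When the helper returns false, the assembly falls through to the branch to idle_return_gpr_loss.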
2019-08-02 13:56:39 +03:00
EXC_COMMON_BEGIN(unrecoverable_mce)
	/*
	 * We are going down. But there are chances that we might get hit by
	 * another MCE during panic path and we may run into unstable state
	 * with no way out. Hence, turn ME bit off while going down, so that
	 * when another MCE is hit during panic path, system will checkstop
	 * and hypervisor will get restarted cleanly by SP.
	 */
BEGIN_FTR_SECTION
	li	r10,0 /* clear MSR_RI */
	mtmsrd	r10,1
	bl	disable_machine_check
END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
	ld	r10,PACAKMSR(r13)
	li	r3,MSR_ME
	andc	r10,r10,r3
	mtmsrd	r10
2020-05-08 07:33:54 +03:00
	lhz	r12,PACA_IN_MCE(r13)
	subi	r12,r12,1
	sth	r12,PACA_IN_MCE(r13)
2016-09-21 10:43:31 +03:00
	/* Invoke machine_check_exception to print MCE event and panic. */
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	machine_check_exception
2019-08-02 13:56:39 +03:00
2016-09-21 10:43:31 +03:00
	/*
2019-08-02 13:56:39 +03:00
	 * We will not reach here. Even if we did, there is no way out.
	 * Call unrecoverable_exception and die.
2016-09-21 10:43:31 +03:00
	 */
2019-08-02 13:56:39 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
2016-09-21 10:43:31 +03:00
	bl	unrecoverable_exception
2019-08-02 13:56:39 +03:00
	b	.
2016-09-21 10:43:31 +03:00
2020-02-25 20:35:14 +03:00
/**
2020-02-25 20:35:28 +03:00
 * Interrupt 0x300 - Data Storage Interrupt (DSI).
 * This is a synchronous interrupt generated due to a data access exception,
 * e.g., a load or store which does not have a valid page table entry with
 * permissions. DAWR matches also fault here, as do RC updates, and minor misc
 * errors e.g., copy/paste, AMO, certain invalid CI accesses, etc.
 *
 * Handling:
 * - Hash MMU
 *   Go to do_hash_page first to see if the HPT can be filled from an entry in
 *   the Linux page table. Hash faults can hit in kernel mode in a fairly
 *   arbitrary state (e.g., interrupts disabled, locks held) when accessing
 *   "non-bolted" regions, e.g., vmalloc space. However these should always be
 *   backed by Linux page tables.
2020-02-25 20:35:14 +03:00
 *
2020-02-25 20:35:28 +03:00
 *   If none is found, do a Linux page fault. Linux page faults can happen in
 *   kernel mode due to user copy operations of course.
2020-02-25 20:35:14 +03:00
 *
2020-02-25 20:35:28 +03:00
 * - Radix MMU
 *   The hardware loads from the Linux page table directly, so a fault goes
 *   immediately to Linux page fault.
2020-02-25 20:35:14 +03:00
 *
2020-02-25 20:35:28 +03:00
 * Conditions like DAWR match are handled on the way in to Linux page fault.
2020-02-25 20:35:14 +03:00
 */
2020-02-25 20:35:10 +03:00
INT_DEFINE_BEGIN(data_access)
	IVEC=0x300
	IDAR=1
	IDSISR=1
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:12 +03:00
	IKVM_SKIP=1
2020-02-25 20:35:10 +03:00
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:10 +03:00
INT_DEFINE_END(data_access)
2009-06-03 01:17:38 +04:00
2019-02-26 11:51:09 +03:00
EXC_REAL_BEGIN(data_access, 0x300, 0x80)
2020-02-25 20:35:26 +03:00
	GEN_INT_ENTRY data_access, virt=0
2019-02-26 11:51:09 +03:00
EXC_REAL_END(data_access, 0x300, 0x80)
EXC_VIRT_BEGIN(data_access, 0x4300, 0x80)
2020-02-25 20:35:10 +03:00
	GEN_INT_ENTRY data_access, virt=1
2019-02-26 11:51:09 +03:00
EXC_VIRT_END(data_access, 0x4300, 0x80)
2016-09-21 10:43:32 +03:00
EXC_COMMON_BEGIN(data_access_common)
2020-02-25 20:35:11 +03:00
	GEN_COMMON data_access
2019-08-02 13:57:01 +03:00
	ld	r4,_DAR(r1)
	ld	r5,_DSISR(r1)
2016-09-21 10:43:32 +03:00
BEGIN_MMU_FTR_SECTION
2019-08-02 13:57:01 +03:00
	ld	r6,_MSR(r1)
	li	r3,0x300
2016-09-21 10:43:32 +03:00
	b	do_hash_page		/* Try to handle as hpte fault */
MMU_FTR_SECTION_ELSE
	b	handle_page_fault
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
2020-02-25 20:35:21 +03:00
	GEN_KVM data_access
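The DSI handling policy described in the comment above (hash MMU tries an HPT refill first, radix goes straight to the Linux page fault) can be summarised in C as below. This is a simplified model, not the kernel's implementation: all names are hypothetical, and the single boolean stands in for whatever do_hash_page finds in the Linux page table.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they model the two MMU modes named in the comment. */
enum mmu_type { MMU_HASH, MMU_RADIX };

/* Hypothetical names for how the fault ends up being resolved. */
enum dsi_path { DSI_HASH_REFILL, DSI_LINUX_PAGE_FAULT };

/*
 * Hash MMU: refill the HPT if the Linux page table already has an entry,
 * otherwise fall back to a Linux page fault.
 * Radix MMU: hardware walks the Linux page table itself, so any fault is
 * a Linux page fault.
 */
static enum dsi_path dsi_resolve(enum mmu_type mmu, bool linux_pte_found)
{
	assert(mmu == MMU_HASH || mmu == MMU_RADIX);
	if (mmu == MMU_HASH && linux_pte_found)
		return DSI_HASH_REFILL;
	return DSI_LINUX_PAGE_FAULT;
}
```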
2009-06-03 01:17:38 +04:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x380 - Data Segment Interrupt (DSLB).
 * This is a synchronous interrupt in response to an MMU fault missing SLB
 * entry for HPT, or an address outside RPT translation range.
 *
 * Handling:
 * - HPT:
 *   This refills the SLB, or reports an access fault similarly to a bad page
 *   fault. When coming from user-mode, the SLB handler may access any kernel
 *   data, though it may itself take a DSLB. When coming from kernel mode,
 *   recursive faults must be avoided so access is restricted to the kernel
 *   image text/data, kernel stack, and any data allocated below
 *   ppc64_bolted_size (first segment). The kernel handler must avoid stomping
 *   on user-handler data structures.
 *
 * A dedicated save area EXSLB is used (XXX: but it actually need not be
 * these days, we could use EXGEN).
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(data_access_slb)
	IVEC=0x380
	IAREA=PACA_EXSLB
	IRECONCILE=0
	IDAR=1
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_SKIP=1
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(data_access_slb)
2016-12-06 04:41:12 +03:00
EXC_REAL_BEGIN(data_access_slb, 0x380, 0x80)
2020-02-25 20:35:26 +03:00
	GEN_INT_ENTRY data_access_slb, virt=0
2016-12-06 04:41:12 +03:00
EXC_REAL_END(data_access_slb, 0x380, 0x80)
EXC_VIRT_BEGIN(data_access_slb, 0x4380, 0x80)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY data_access_slb, virt=1
2016-12-06 04:41:12 +03:00
EXC_VIRT_END(data_access_slb, 0x4380, 0x80)
2018-09-14 18:30:51 +03:00
EXC_COMMON_BEGIN(data_access_slb_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON data_access_slb
2019-08-02 13:56:57 +03:00
	ld	r4,_DAR(r1)
2018-09-14 18:30:51 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
2019-03-29 10:42:57 +03:00
BEGIN_MMU_FTR_SECTION
	/* HPT case, do SLB fault */
2018-09-14 18:30:51 +03:00
	bl	do_slb_fault
	cmpdi	r3,0
	bne-	1f
2020-02-25 20:35:37 +03:00
	b	fast_interrupt_return
2018-09-14 18:30:51 +03:00
1:	/* Error case */
2019-03-29 10:42:57 +03:00
MMU_FTR_SECTION_ELSE
	/* Radix case, access is outside page table range */
	li	r3,-EFAULT
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
2018-09-14 18:30:51 +03:00
	std	r3,RESULT(r1)
	RECONCILE_IRQ_STATE(r10, r11)
	ld	r4,_DAR(r1)
	ld	r5,RESULT(r1)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	do_bad_slb_fault
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2018-09-14 18:30:51 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM data_access_slb
2016-09-21 10:43:33 +03:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x400 - Instruction Storage Interrupt (ISI).
 * This is a synchronous interrupt in response to an MMU fault due to an
 * instruction fetch.
 *
 * Handling:
 * Similar to DSI, though in response to fetch. The faulting address is found
 * in SRR0 (rather than DAR), and status in SRR1 (rather than DSISR).
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(instruction_access)
	IVEC=0x400
2020-02-25 20:35:18 +03:00
	IISIDE=1
	IDAR=1
	IDSISR=1
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(instruction_access)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(instruction_access, 0x400, 0x80)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_access, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(instruction_access, 0x400, 0x80)
EXC_VIRT_BEGIN(instruction_access, 0x4400, 0x80)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_access, virt=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(instruction_access, 0x4400, 0x80)
2016-09-21 10:43:34 +03:00
EXC_COMMON_BEGIN(instruction_access_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON instruction_access
2019-08-02 13:57:01 +03:00
	ld	r4,_DAR(r1)
	ld	r5,_DSISR(r1)
2016-09-21 10:43:34 +03:00
BEGIN_MMU_FTR_SECTION
2019-08-02 13:57:01 +03:00
	ld	r6,_MSR(r1)
	li	r3,0x400
2016-09-21 10:43:34 +03:00
	b	do_hash_page		/* Try to handle as hpte fault */
MMU_FTR_SECTION_ELSE
	b	handle_page_fault
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
2020-02-25 20:35:21 +03:00
	GEN_KVM instruction_access
2009-06-03 01:17:38 +04:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x480 - Instruction Segment Interrupt (ISLB).
 * This is a synchronous interrupt in response to an MMU fault due to an
 * instruction fetch.
 *
 * Handling:
 * Similar to DSLB, though in response to fetch. The faulting address is found
 * in SRR0 (rather than DAR).
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(instruction_access_slb)
	IVEC=0x480
	IAREA=PACA_EXSLB
	IRECONCILE=0
2020-02-25 20:35:18 +03:00
	IISIDE=1
	IDAR=1
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(instruction_access_slb)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(instruction_access_slb, 0x480, 0x80)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_access_slb, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(instruction_access_slb, 0x480, 0x80)
EXC_VIRT_BEGIN(instruction_access_slb, 0x4480, 0x80)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_access_slb, virt=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(instruction_access_slb, 0x4480, 0x80)
2018-09-14 18:30:51 +03:00
EXC_COMMON_BEGIN(instruction_access_slb_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON instruction_access_slb
2019-08-02 13:56:57 +03:00
	ld	r4,_DAR(r1)
2018-09-14 18:30:51 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
2019-03-29 10:42:57 +03:00
BEGIN_MMU_FTR_SECTION
	/* HPT case, do SLB fault */
2018-09-14 18:30:51 +03:00
	bl	do_slb_fault
	cmpdi	r3,0
	bne-	1f
2020-02-25 20:35:37 +03:00
	b	fast_interrupt_return
2018-09-14 18:30:51 +03:00
1:	/* Error case */
2019-03-29 10:42:57 +03:00
MMU_FTR_SECTION_ELSE
	/* Radix case, access is outside page table range */
	li	r3,-EFAULT
ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
2018-09-14 18:30:51 +03:00
	std	r3,RESULT(r1)
2016-09-21 10:43:35 +03:00
	RECONCILE_IRQ_STATE(r10, r11)
2019-08-02 13:56:57 +03:00
	ld	r4,_DAR(r1)
2018-09-14 18:30:51 +03:00
	ld	r5,RESULT(r1)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	do_bad_slb_fault
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:43:35 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM instruction_access_slb
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x500 - External Interrupt.
 * This is an asynchronous maskable interrupt in response to an "external
 * exception" from the interrupt controller or hypervisor (e.g., device
 * interrupt). It is maskable in hardware by clearing MSR[EE], and
 * soft-maskable with IRQS_DISABLED mask (i.e., local_irq_disable()).
 *
 * When running in HV mode, Linux sets up the LPCR[LPES] bit such that
 * interrupts are delivered with HSRR registers, guests use SRRs, which
 * requires IHSRR_IF_HVMODE.
 *
 * On bare metal POWER9 and later, Linux sets the LPCR[HVICE] bit such that
 * external interrupts are delivered as Hypervisor Virtualization Interrupts
 * rather than External Interrupts.
 *
 * Handling:
 * This calls into Linux IRQ handler. NVGPRs are not saved to reduce overhead,
 * because registers at the time of the interrupt are not so important as it is
 * asynchronous.
 *
 * If soft masked, the masked handler will note the pending interrupt for
 * replay, and clear MSR[EE] in the interrupted context.
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(hardware_interrupt)
	IVEC=0x500
2020-02-25 20:35:27 +03:00
	IHSRR_IF_HVMODE=1
2020-02-25 20:35:14 +03:00
	IMASK=IRQS_DISABLED
	IKVM_REAL=1
	IKVM_VIRT=1
INT_DEFINE_END(hardware_interrupt)
2016-12-06 04:41:12 +03:00
EXC_REAL_BEGIN(hardware_interrupt, 0x500, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY hardware_interrupt, virt=0
2016-12-06 04:41:12 +03:00
EXC_REAL_END(hardware_interrupt, 0x500, 0x100)
EXC_VIRT_BEGIN(hardware_interrupt, 0x4500, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY hardware_interrupt, virt=1
2016-12-06 04:41:12 +03:00
EXC_VIRT_END(hardware_interrupt, 0x4500, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(hardware_interrupt_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON hardware_interrupt
2020-02-25 20:35:13 +03:00
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	do_IRQ
2020-02-25 20:35:38 +03:00
	b	interrupt_return
2016-09-21 10:43:36 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM hardware_interrupt
2016-09-21 10:43:36 +03:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x600 - Alignment Interrupt
 * This is a synchronous interrupt in response to data alignment fault.
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(alignment)
	IVEC=0x600
	IDAR=1
	IDSISR=1
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(alignment)
2019-02-26 11:51:09 +03:00
EXC_REAL_BEGIN(alignment, 0x600, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY alignment, virt=0
2019-02-26 11:51:09 +03:00
EXC_REAL_END(alignment, 0x600, 0x100)
EXC_VIRT_BEGIN(alignment, 0x4600, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY alignment, virt=1
2019-02-26 11:51:09 +03:00
EXC_VIRT_END(alignment, 0x4600, 0x100)
2016-09-21 10:43:37 +03:00
EXC_COMMON_BEGIN(alignment_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON alignment
2016-09-21 10:43:37 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	alignment_exception
2020-02-25 20:35:38 +03:00
	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified, rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless except when PREEMPT=y, this calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 20:35:37 +03:00
	b	interrupt_return

	GEN_KVM alignment

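In this listing, each virtual-mode vector sits 0x4000 above its real-mode counterpart (0x600 and 0x4600 for alignment, for example). A minimal C sketch of that relationship follows. The names `VIRT_VECTOR_OFFSET` and `virt_vector` are hypothetical, chosen for illustration, and are not kernel symbols.

```c
#include <stdint.h>

/* Offset between the real and virtual vectors as they appear in this
 * listing (0x600 -> 0x4600). Illustrative name, not a kernel symbol. */
#define VIRT_VECTOR_OFFSET 0x4000u

/* Return the virtual-mode vector address for a real-mode vector. */
static inline uint32_t virt_vector(uint32_t real_vector)
{
	return real_vector + VIRT_VECTOR_OFFSET;
}
```

For instance, `virt_vector(0x700)` yields the 0x4700 address used by the program check entry.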
/**
 * Interrupt 0x700 - Program Interrupt (program check).
 * This is a synchronous interrupt in response to various instruction faults:
 * traps, privilege errors, TM errors, floating point exceptions.
 *
 * Handling:
 * This interrupt may use the "emergency stack" in some cases when being taken
 * from kernel context, which complicates handling.
 */
INT_DEFINE_BEGIN(program_check)
	IVEC=0x700
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(program_check)

EXC_REAL_BEGIN(program_check, 0x700, 0x100)
	GEN_INT_ENTRY program_check, virt=0
EXC_REAL_END(program_check, 0x700, 0x100)
EXC_VIRT_BEGIN(program_check, 0x4700, 0x100)
	GEN_INT_ENTRY program_check, virt=1
EXC_VIRT_END(program_check, 0x4700, 0x100)
EXC_COMMON_BEGIN(program_check_common)
	__GEN_COMMON_ENTRY program_check
powerpc/64s: Use emergency stack for kernel TM Bad Thing program checks
When using transactional memory (TM), the CPU can be in one of six
states as far as TM is concerned, encoded in the Machine State
Register (MSR). Certain state transitions are illegal and if attempted
trigger a "TM Bad Thing" type program check exception.
If we ever hit one of these exceptions it's treated as a bug, ie. we
oops, and kill the process and/or panic, depending on configuration.
One case where we can trigger a TM Bad Thing, is when returning to
userspace after a system call or interrupt, using RFID. When this
happens the CPU first restores the user register state, in particular
r1 (the stack pointer) and then attempts to update the MSR. However
the MSR update is not allowed and so we take the program check with
the user register state, but the kernel MSR.
This tricks the exception entry code into thinking we have a bad
kernel stack pointer, because the MSR says we're coming from the
kernel, but r1 is pointing to userspace.
To avoid this we instead always switch to the emergency stack if we
take a TM Bad Thing from the kernel. That way none of the user
register values are used, other than for printing in the oops message.
This is the fix for CVE-2017-1000255.
Fixes: 5d176f751ee3 ("powerpc: tm: Enable transactional memory (TM) lazily for userspace")
Cc: stable@vger.kernel.org # v4.9+
Signed-off-by: Cyril Bur <cyrilbur@gmail.com>
[mpe: Rewrite change log & comments, tweak asm slightly]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2017-08-17 13:42:26 +03:00
	/*
	 * It's possible to receive a TM Bad Thing type program check with
	 * userspace register values (in particular r1), but with SRR1 reporting
	 * that we came from the kernel. Normally that would confuse the bad
	 * stack logic, and we would report a bad kernel stack pointer. Instead
	 * we switch to the emergency stack if we're taking a TM Bad Thing from
	 * the kernel.
	 */
powerpc/64s/exception: remove bad stack branch
The bad stack test in interrupt handlers has a few problems. For
performance it is taken in the common case, which is a fetch bubble
and a waste of i-cache.
For code development and maintenance, it requires yet another stack
frame setup routine, and that constrains all exception handlers to
follow the same register save pattern which inhibits future
optimisation.
Remove the test/branch and replace it with a trap. Teach the program
check handler to use the emergency stack for this case.
This does not result in quite so nice a message, however the SRR0 and
SRR1 of the crashed interrupt can be seen in r11 and r12, as is the
original r1 (adjusted by INT_FRAME_SIZE). These are the most important
parts to debugging the issue.
The original r9-12 and cr0 is lost, which is the main downside.
kernel BUG at linux/arch/powerpc/kernel/exceptions-64s.S:847!
Oops: Exception in kernel mode, sig: 5 [#1]
BE SMP NR_CPUS=2048 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1 Comm: swapper/0 Not tainted
NIP: c000000000009108 LR: c000000000cadbcc CTR: c0000000000090f0
REGS: c0000000fffcbd70 TRAP: 0700 Not tainted
MSR: 9000000000021032 <SF,HV,ME,IR,DR,RI> CR: 28222448 XER: 20040000
CFAR: c000000000009100 IRQMASK: 0
GPR00: 000000000000003d fffffffffffffd00 c0000000018cfb00 c0000000f02b3166
GPR04: fffffffffffffffd 0000000000000007 fffffffffffffffb 0000000000000030
GPR08: 0000000000000037 0000000028222448 0000000000000000 c000000000ca8de0
GPR12: 9000000002009032 c000000001ae0000 c000000000010a00 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: c0000000f00322c0 c000000000f85200 0000000000000004 ffffffffffffffff
GPR24: fffffffffffffffe 0000000000000000 0000000000000000 000000000000000a
GPR28: 0000000000000000 0000000000000000 c0000000f02b391c c0000000f02b3167
NIP [c000000000009108] decrementer_common+0x18/0x160
LR [c000000000cadbcc] .vsnprintf+0x3ec/0x4f0
Call Trace:
Instruction dump:
996d098a 994d098b 38610070 480246ed 48005518 60000000 38200000 718a4000
7c2a0b78 3821fd00 41c20008 e82d0970 <0981fd00> f92101a0 f9610170 f9810178
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2019-06-28 09:33:18 +03:00
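The choice between the normal stack and the emergency stack described in the two commit messages above can be modelled in C. This is only a sketch of the decision, not the kernel's implementation. The constant values are assumptions for illustration: `INT_FRAME_SIZE` depends on the kernel configuration (0x300 is consistent with the GPR01 value fffffffffffffd00 in the oops above), and `use_emergency_stack` is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants; the real definitions live in kernel headers. */
#define MSR_PR          (1ull << 14)   /* problem state (userspace) bit */
#define SRR1_PROGTM     0x00200000ull  /* program check was a TM Bad Thing */
#define INT_FRAME_SIZE  0x300          /* assumed frame size for this sketch */

/*
 * Returns true when the program check handler should switch to the
 * emergency stack.
 *   srr1 - saved MSR at the time of the interrupt (r12 in the asm)
 *   r1   - stack pointer seen by the handler
 */
static bool use_emergency_stack(uint64_t srr1, uint64_t r1)
{
	if (srr1 & MSR_PR)
		return false;	/* came from userspace: normal path */
	if (srr1 & SRR1_PROGTM)
		return true;	/* TM Bad Thing taken from the kernel */
	/*
	 * cmpdi is a signed compare: kernel addresses have the top bit
	 * set, so they compare below -INT_FRAME_SIZE and stay on the
	 * normal path.
	 */
	return (int64_t)r1 >= -(int64_t)INT_FRAME_SIZE;
}
```

With these assumptions, a kernel stack pointer such as 0xc0000000fffcbd70 stays on the normal path, while the trapped value 0xfffffffffffffd00 selects the emergency stack.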
	andi.	r10,r12,MSR_PR
	bne	2f			/* If userspace, go normal path */
	andis.	r10,r12,(SRR1_PROGTM)@h
	bne	1f			/* If TM, emergency */
	cmpdi	r1,-INT_FRAME_SIZE	/* check if r1 is in userspace */
	blt	2f			/* normal path if not */

	/* Use the emergency stack */
1:	andi.	r10,r12,MSR_PR		/* Set CR0 correctly for label */
					/* 3 in EXCEPTION_PROLOG_COMMON */
	mr	r10,r1			/* Save r1 */
	ld	r1,PACAEMERGSP(r13)	/* Use emergency stack */
	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */
	__ISTACK(program_check)=0
	__GEN_COMMON_BODY program_check
	b	3f
2:
	__ISTACK(program_check)=1
	__GEN_COMMON_BODY program_check
3:
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	program_check_exception
	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
	b	interrupt_return

	GEN_KVM program_check

/*
 * Interrupt 0x800 - Floating-Point Unavailable Interrupt.
 * This is a synchronous interrupt in response to executing an fp instruction
 * with MSR[FP]=0.
 *
 * Handling:
 * This will load FP registers and enable the FP bit if coming from userspace,
 * otherwise report a bad kernel use of FP.
 */
INT_DEFINE_BEGIN(fp_unavailable)
	IVEC=0x800
	IRECONCILE=0
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(fp_unavailable)

EXC_REAL_BEGIN(fp_unavailable, 0x800, 0x100)
	GEN_INT_ENTRY fp_unavailable, virt=0
EXC_REAL_END(fp_unavailable, 0x800, 0x100)
EXC_VIRT_BEGIN(fp_unavailable, 0x4800, 0x100)
	GEN_INT_ENTRY fp_unavailable, virt=1
EXC_VIRT_END(fp_unavailable, 0x4800, 0x100)
EXC_COMMON_BEGIN(fp_unavailable_common)
	GEN_COMMON fp_unavailable
	bne	1f			/* if from user, just load it up */
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	kernel_fp_unavailable_exception
0:	trap
	EMIT_BUG_ENTRY 0b, __FILE__, __LINE__, 0
1:
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
BEGIN_FTR_SECTION
	/* Test if 2 TM state bits are zero.  If non-zero (ie. userspace was in
	 * transaction), go do TM stuff
	 */
	rldicl.	r0, r12, (64-MSR_TS_LG), (64-2)
	bne-	2f
END_FTR_SECTION_IFSET(CPU_FTR_TM)
#endif
	bl	load_up_fpu
	b	fast_interrupt_return
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
2:	/* User process was in a transaction */
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	fp_unavailable_tm
	b	interrupt_return
#endif

	GEN_KVM fp_unavailable

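The `rldicl.` test in fp_unavailable_common extracts the two TM state bits from the saved MSR by rotating and masking. The C sketch below reproduces that arithmetic under one assumption: `MSR_TS_LG` is taken to be 33 here, whereas the kernel defines it in its own headers. The helper names `rotl64`, `tm_state_bits` and `in_transaction` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit position of the two-bit TS field in the MSR. */
#define MSR_TS_LG 33

/* Rotate a 64-bit value left by n bits (0 < n < 64). */
static uint64_t rotl64(uint64_t v, unsigned n)
{
	return (v << n) | (v >> (64 - n));
}

/*
 * Equivalent of: rldicl. r0, r12, (64-MSR_TS_LG), (64-2)
 * Rotate left by 64-MSR_TS_LG, then clear the upper 62 bits, which
 * leaves the two TS bits in the low end of the result.
 */
static uint64_t tm_state_bits(uint64_t msr)
{
	return rotl64(msr, 64 - MSR_TS_LG) & 0x3;
}

/* Non-zero TS bits mean userspace was in a transaction. */
static bool in_transaction(uint64_t msr)
{
	return tm_state_bits(msr) != 0;
}
```

The rotate-and-mask form gives the same result as `(msr >> MSR_TS_LG) & 3`; the assembly uses it because the record form of the instruction also sets CR0 for the following `bne-`.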
/**
 * Interrupt 0x900 - Decrementer Interrupt.
 * This is an asynchronous interrupt in response to a decrementer exception
 * (e.g., DEC has wrapped below zero). It is maskable in hardware by clearing
 * MSR[EE], and soft-maskable with IRQS_DISABLED mask (i.e.,
 * local_irq_disable()).
 *
 * Handling:
 * This calls into Linux timer handler. NVGPRs are not saved (see 0x500).
 *
 * If soft masked, the masked handler will note the pending interrupt for
 * replay, and bump the decrementer to a high value, leaving MSR[EE] enabled
 * in the interrupted context.
 * If PPC_WATCHDOG is configured, the soft masked handler will actually set
 * things back up to run soft_nmi_interrupt as a regular interrupt handler
 * on the emergency stack.
 */
INT_DEFINE_BEGIN(decrementer)
	IVEC=0x900
	IMASK=IRQS_DISABLED
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(decrementer)

EXC_REAL_BEGIN(decrementer, 0x900, 0x80)
	GEN_INT_ENTRY decrementer, virt=0
EXC_REAL_END(decrementer, 0x900, 0x80)
EXC_VIRT_BEGIN(decrementer, 0x4900, 0x80)
	GEN_INT_ENTRY decrementer, virt=1
EXC_VIRT_END(decrementer, 0x4900, 0x80)
EXC_COMMON_BEGIN(decrementer_common)
	GEN_COMMON decrementer
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	timer_interrupt
	b	interrupt_return

	GEN_KVM decrementer
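The soft-masking behaviour described in the decrementer comment (note the pending interrupt, bump the decrementer, replay later) can be pictured with a small user-space model. This is a simplified sketch and not the kernel's code: `struct cpu_model`, `PENDING_DEC`, `DEC_MAX` and the function names are hypothetical, and the PPC_WATCHDOG path is left out.

```c
#include <stdint.h>

#define IRQS_ENABLED	0
#define IRQS_DISABLED	1
#define PENDING_DEC	0x08		/* hypothetical "decrementer pending" flag */
#define DEC_MAX		0x7fffffffu	/* assumed "high value" used for the bump */

struct cpu_model {
	uint8_t  soft_mask;	/* IRQS_ENABLED or IRQS_DISABLED */
	uint8_t  pending;	/* interrupts noted for later replay */
	uint32_t dec;		/* modelled decrementer register */
	unsigned timer_runs;	/* how many times the timer handler ran */
};

/* Stand-in for the timer handler called on the unmasked path. */
static void timer_handler(struct cpu_model *cpu)
{
	cpu->timer_runs++;
}

/* A decrementer exception arrives. */
static void decrementer_exception(struct cpu_model *cpu)
{
	if (cpu->soft_mask == IRQS_DISABLED) {
		cpu->pending |= PENDING_DEC;	/* note it for replay */
		cpu->dec = DEC_MAX;		/* stop it firing again soon */
		return;				/* handler does not run now */
	}
	timer_handler(cpu);
}

static void soft_irq_disable(struct cpu_model *cpu)
{
	cpu->soft_mask = IRQS_DISABLED;
}

/* Re-enabling replays anything that was noted while masked. */
static void soft_irq_enable(struct cpu_model *cpu)
{
	cpu->soft_mask = IRQS_ENABLED;
	if (cpu->pending & PENDING_DEC) {
		cpu->pending &= (uint8_t)~PENDING_DEC;
		timer_handler(cpu);
	}
}
```

In this model a decrementer exception taken between `soft_irq_disable()` and `soft_irq_enable()` runs the timer handler exactly once, at enable time.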
powerpc: Fix "attempt to move .org backwards" error
Building a 64-bit powerpc kernel with PR KVM enabled currently gives
this error:
AS arch/powerpc/kernel/head_64.o
arch/powerpc/kernel/exceptions-64s.S: Assembler messages:
arch/powerpc/kernel/exceptions-64s.S:258: Error: attempt to move .org backwards
make[2]: *** [arch/powerpc/kernel/head_64.o] Error 1
This happens because the MASKABLE_EXCEPTION_PSERIES macro turns into
33 instructions, but we only have space for 32 at the decrementer
interrupt vector (from 0x900 to 0x980).
In the code generated by the MASKABLE_EXCEPTION_PSERIES macro, we
currently have two instances of the HMT_MEDIUM macro, which has the
effect of setting the SMT thread priority to medium. One is the
first instruction, and is overwritten by a no-op on processors where
we save the PPR (processor priority register), that is, POWER7 or
later. The other is after we have saved the PPR.
In order to reduce the code at 0x900 by one instruction, we omit the
first HMT_MEDIUM. On processors without SMT this will have no effect
since HMT_MEDIUM is a no-op there. On POWER5 and RS64 machines this
will mean that the first few instructions take a little longer in the
case where a decrementer interrupt occurs when the hardware thread is
running at low SMT priority. On POWER6 and later machines, the
hardware automatically boosts the thread priority when a decrementer
interrupt is taken if the thread priority was below medium, so this
change won't make any difference.
The alternative would be to branch out of line after saving the CFAR.
However, that would incur an extra overhead on all processors, whereas
the approach adopted here only adds overhead on older threaded processors.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2013-04-25 21:51:40 +04:00
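The space constraint in this commit message is simple arithmetic: the slot from 0x900 to 0x980 is 0x80 bytes, and at 4 bytes per instruction that is 32 instructions, one fewer than the 33 the macro expanded to. A small C sketch of the check follows; `slot_capacity` and `fits_in_slot` are hypothetical helper names, and a fixed 4-byte instruction size is assumed.

```c
#include <stdbool.h>
#include <stddef.h>

#define PPC_INSN_BYTES 4u	/* assumed fixed instruction size */

/* Number of instructions that fit between two vector addresses. */
static size_t slot_capacity(unsigned start, unsigned end)
{
	return (end - start) / PPC_INSN_BYTES;
}

/* True when a handler of n instructions fits in [start, end). */
static bool fits_in_slot(size_t n, unsigned start, unsigned end)
{
	return n <= slot_capacity(start, end);
}
```

Here `fits_in_slot(33, 0x900, 0x980)` is false and `fits_in_slot(32, 0x900, 0x980)` is true, which is why dropping a single instruction was enough.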
/**
 * Interrupt 0x980 - Hypervisor Decrementer Interrupt.
 * This is an asynchronous interrupt, similar to 0x900 but for the HDEC
 * register.
 *
 * Handling:
 * Linux does not use this outside KVM where it's used to keep a host timer
 * while the guest is given control of DEC. It should normally be caught by
 * the KVM test and routed there.
 */
INT_DEFINE_BEGIN(hdecrementer)
	IVEC=0x980
	IHSRR=1
	ISTACK=0
	IRECONCILE=0
	IKVM_REAL=1
	IKVM_VIRT=1
INT_DEFINE_END(hdecrementer)

EXC_REAL_BEGIN(hdecrementer, 0x980, 0x80)
	GEN_INT_ENTRY hdecrementer, virt=0
EXC_REAL_END(hdecrementer, 0x980, 0x80)
EXC_VIRT_BEGIN(hdecrementer, 0x4980, 0x80)
	GEN_INT_ENTRY hdecrementer, virt=1
EXC_VIRT_END(hdecrementer, 0x4980, 0x80)
EXC_COMMON_BEGIN(hdecrementer_common)
	__GEN_COMMON_ENTRY hdecrementer
	/*
	 * Hypervisor decrementer interrupts not caught by the KVM test
	 * shouldn't occur but are sometimes left pending on exit from a KVM
	 * guest.  We don't need to do anything to clear them, as they are
	 * edge-triggered.
	 *
	 * Be careful to avoid touching the kernel stack.
	 */
	ld	r10,PACA_EXGEN+EX_CTR(r13)
	mtctr	r10
	mtcrf	0x80,r9
	ld	r9,PACA_EXGEN+EX_R9(r13)
	ld	r10,PACA_EXGEN+EX_R10(r13)
	ld	r11,PACA_EXGEN+EX_R11(r13)
	ld	r12,PACA_EXGEN+EX_R12(r13)
	ld	r13,PACA_EXGEN+EX_R13(r13)
	HRFI_TO_KERNEL

	GEN_KVM hdecrementer

/**
 * Interrupt 0xa00 - Directed Privileged Doorbell Interrupt.
 * This is an asynchronous interrupt in response to a msgsndp doorbell.
 * It is maskable in hardware by clearing MSR[EE], and soft-maskable with
 * IRQS_DISABLED mask (i.e., local_irq_disable()).
 *
 * Handling:
 * Guests may use this for IPIs between threads in a core if the
 * hypervisor supports it. NVGPRS are not saved (see 0x500).
 *
 * If soft masked, the masked handler will note the pending interrupt for
 * replay, leaving MSR[EE] enabled in the interrupted context because the
 * doorbells are edge triggered.
 */
INT_DEFINE_BEGIN(doorbell_super)
	IVEC=0xa00
	IMASK=IRQS_DISABLED
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(doorbell_super)

EXC_REAL_BEGIN(doorbell_super, 0xa00, 0x100)
	GEN_INT_ENTRY doorbell_super, virt=0
EXC_REAL_END(doorbell_super, 0xa00, 0x100)
EXC_VIRT_BEGIN(doorbell_super, 0x4a00, 0x100)
	GEN_INT_ENTRY doorbell_super, virt=1
EXC_VIRT_END(doorbell_super, 0x4a00, 0x100)
EXC_COMMON_BEGIN(doorbell_super_common)
	GEN_COMMON doorbell_super
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
#ifdef CONFIG_PPC_DOORBELL
	bl	doorbell_exception
#else
	bl	unknown_exception
#endif
	b	interrupt_return

	GEN_KVM doorbell_super


EXC_REAL_NONE(0xb00, 0x100)
EXC_VIRT_NONE(0x4b00, 0x100)

/**
* Interrupt 0xc00 - System Call Interrupt (syscall, hcall).
* This is a synchronous interrupt invoked with the "sc" instruction. The
* system call is invoked with "sc 0" and does not alter the HV bit, so it
* is directed to the currently running OS. The hypercall is invoked with
* "sc 1" and it sets HV=1, so it elevates to hypervisor.
2017-06-08 18:35:04 +03:00
*
* In HPT, sc 1 always goes to 0xc00 real mode. In RADIX, sc 1 can go to
* 0x4c00 virtual mode.
*
2020-02-25 20:35:28 +03:00
* Handling:
* If the KVM test fires then it was due to a hypercall and is accordingly
* routed to KVM. Otherwise this executes a normal Linux system call.
*
2017-06-08 18:35:04 +03:00
* Call convention:
*
2019-08-28 11:27:29 +03:00
* syscall and hypercalls register conventions are documented in
* Documentation/powerpc/syscall64-abi.rst and
* Documentation/powerpc/papr_hcalls.rst respectively.
2017-06-08 18:35:04 +03:00
*
* The intersection of volatile registers that don't contain possible
2017-07-18 08:32:44 +03:00
* inputs is: cr0, xer, ctr. We may use these as scratch regs upon entry
* without saving, though xer is not a good idea to use, as hardware may
* interpret some bits so it may be costly to change them.
2017-06-08 18:35:04 +03:00
*/
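As a rough illustration of the two entry points named in the comment above (0xc00 in real mode, 0x4c00 in virtual mode), the sketch below computes a vector offset from an interrupt number. The helper name `vector_offset` and the constant `VIRT_VECTOR_BASE` are hypothetical and exist only for this example; the 0x4000 displacement is an assumption read off the real/virtual pairs used in this file (0xa00/0x4a00, 0xc00/0x4c00, 0xd00/0x4d00).

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: each virtual-mode vector sits 0x4000 above its real-mode
 * twin, as the EXC_REAL_BEGIN/EXC_VIRT_BEGIN pairs in this file suggest.
 */
#define VIRT_VECTOR_BASE 0x4000u

/* Hypothetical helper: offset of the entry point for interrupt `ivec`. */
uint32_t vector_offset(uint32_t ivec, bool virt)
{
	return virt ? VIRT_VECTOR_BASE + ivec : ivec;
}
```

With these definitions, `vector_offset(0xc00, true)` gives the 0x4c00 location that a radix `sc 1` may be delivered to, while `vector_offset(0xc00, false)` gives the real-mode 0xc00 location.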
2020-02-25 20:35:17 +03:00
INT_DEFINE_BEGIN(system_call)
IVEC=0xc00
IKVM_REAL=1
IKVM_VIRT=1
INT_DEFINE_END(system_call)
2019-06-22 16:15:31 +03:00
.macro SYSTEM_CALL virt
2017-01-30 13:21:40 +03:00
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
2017-06-08 18:35:04 +03:00
/*
* There is a little bit of juggling to get syscall and hcall
2017-07-18 08:32:44 +03:00
* working well. Save r13 in ctr to avoid using SPRG scratch
* register.
2017-06-08 18:35:04 +03:00
*
* Userspace syscalls have already saved the PPR, hcalls must save
* it before setting HMT_MEDIUM.
*/
2019-06-22 16:15:31 +03:00
mtctr r13
GET_PACA(r13)
std r10,PACA_EXGEN+EX_R10(r13)
INTERRUPT_TO_KERNEL
2020-02-25 20:35:24 +03:00
KVMTEST system_call /* uses r10, branch to system_call_kvm */
2019-06-22 16:15:31 +03:00
mfctr r9
2017-01-30 13:21:40 +03:00
#else
2019-06-22 16:15:31 +03:00
mr r9,r13
GET_PACA(r13)
INTERRUPT_TO_KERNEL
2017-01-30 13:21:40 +03:00
#endif
2016-09-21 10:43:44 +03:00
2017-10-09 13:54:05 +03:00
#ifdef CONFIG_PPC_FAST_ENDIAN_SWITCH
2019-06-22 16:15:31 +03:00
BEGIN_FTR_SECTION
cmpdi r0,0x1ebe
beq- 1f
END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
#endif
2016-09-21 10:43:44 +03:00
2019-06-28 08:33:20 +03:00
/* We reach here with PACA in r13, r13 in r9. */
2019-06-22 16:15:31 +03:00
mfspr r11,SPRN_SRR0
mfspr r12,SPRN_SRR1
2019-06-28 08:33:20 +03:00
HMT_MEDIUM
.if ! \virt
2019-06-22 16:15:31 +03:00
__LOAD_HANDLER(r10, system_call_common)
mtspr SPRN_SRR0,r10
ld r10,PACAKMSR(r13)
mtspr SPRN_SRR1,r10
RFI_TO_KERNEL
b . /* prevent speculative execution */
.else
2019-06-28 08:33:20 +03:00
li r10,MSR_RI
mtmsrd r10,1 /* Set RI (EE=0) */
2019-06-22 16:15:31 +03:00
#ifdef CONFIG_RELOCATABLE
__LOAD_HANDLER(r10, system_call_common)
mtctr r10
bctr
2016-09-21 10:43:44 +03:00
#else
2019-06-22 16:15:31 +03:00
b system_call_common
#endif
.endif
#ifdef CONFIG_PPC_FAST_ENDIAN_SWITCH
/* Fast LE/BE switch system call */
1: mfspr r12,SPRN_SRR1
xori r12,r12,MSR_LE
mtspr SPRN_SRR1,r12
mr r13,r9
RFI_TO_USER /* return to userspace */
b . /* prevent speculative execution */
2016-09-21 10:43:44 +03:00
#endif
2019-06-22 16:15:31 +03:00
.endm
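The fast endian switch path in the macro above can be modelled in a few lines of C. This is only a sketch: the function name `fast_endian_switch` and the constant name `ENDIAN_SWITCH_NR` are hypothetical, `MSR_LE` is assumed to be the lowest MSR bit (value 1), and the `CPU_FTR_REAL_LE` feature gate is deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_LE           0x1ull    /* assumed: little-endian mode bit of the MSR */
#define ENDIAN_SWITCH_NR 0x1ebeull /* magic value the macro compares r0 against */

/*
 * Hypothetical model of the "1:" path: when r0 holds the magic number,
 * flip MSR_LE in the saved SRR1 image and report that the fast path was
 * taken; otherwise leave *srr1 untouched.
 */
bool fast_endian_switch(uint64_t r0, uint64_t *srr1)
{
	if (r0 != ENDIAN_SWITCH_NR)
		return false;
	*srr1 ^= MSR_LE;	/* same effect as: xori r12,r12,MSR_LE */
	return true;
}
```

Because the bit is toggled rather than set, calling the helper twice with the magic number returns the saved MSR image to its original endianness.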
2016-09-21 10:43:44 +03:00
2016-12-06 04:41:12 +03:00
EXC_REAL_BEGIN(system_call, 0xc00, 0x100)
2019-06-22 16:15:31 +03:00
SYSTEM_CALL 0
2016-12-06 04:41:12 +03:00
EXC_REAL_END(system_call, 0xc00, 0x100)
EXC_VIRT_BEGIN(system_call, 0x4c00, 0x100)
2019-06-22 16:15:31 +03:00
SYSTEM_CALL 1
2016-12-06 04:41:12 +03:00
EXC_VIRT_END(system_call, 0x4c00, 0x100)
2016-09-21 10:43:44 +03:00
2017-06-08 18:35:04 +03:00
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
2020-02-25 20:35:21 +03:00
TRAMP_REAL_BEGIN(system_call_kvm)
2017-06-08 18:35:04 +03:00
/*
* This is a hcall, so register convention is as above, with these
* differences:
* r13 = PACA
2017-07-18 08:32:44 +03:00
* ctr = orig r13
* orig r10 saved in PACA
2017-06-08 18:35:04 +03:00
*/
/*
* Save the PPR (on systems that support it) before changing to
* HMT_MEDIUM. That allows the KVM code to save that value into the
* guest state (it is the guest's PPR value).
*/
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
2020-02-25 20:35:21 +03:00
mfspr r10,SPRN_PPR
std r10,HSTATE_PPR(r13)
2020-02-25 20:35:23 +03:00
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
2017-06-08 18:35:04 +03:00
HMT_MEDIUM
mfctr r10
2017-07-18 08:32:44 +03:00
SET_SCRATCH0(r10)
2020-02-25 20:35:21 +03:00
mfcr r10
std r12,HSTATE_SCRATCH0(r13)
sldi r12,r10,32
ori r12,r12,0xc00
#ifdef CONFIG_RELOCATABLE
/*
* Requires __LOAD_FAR_HANDLER because kvmppc_interrupt lives
* outside the head section.
*/
__LOAD_FAR_HANDLER(r10, kvmppc_interrupt)
mtctr r10
ld r10,PACA_EXGEN+EX_R10(r13)
bctr
#else
ld r10,PACA_EXGEN+EX_R10(r13)
b kvmppc_interrupt
#endif
2017-06-08 18:35:04 +03:00
#endif
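The `mfcr` / `sldi` / `ori` sequence in `system_call_kvm` above packs the condition register and the interrupt vector number into r12 before branching to KVM. A minimal C sketch of that packing follows; the helper names are hypothetical, and the two unpack helpers only show one plausible way a consumer could split the value again, not how the KVM code actually does it.

```c
#include <stdint.h>

/*
 * Hypothetical helper mirroring:
 *   mfcr r10; sldi r12,r10,32; ori r12,r12,0xc00
 * The 32-bit CR image ends up in the upper word, the vector number in
 * the low bits.
 */
uint64_t pack_cr_and_trap(uint32_t cr, uint16_t trap)
{
	return ((uint64_t)cr << 32) | trap;
}

/* Assumed inverse operations, for illustration only. */
uint32_t unpack_cr(uint64_t r12)
{
	return (uint32_t)(r12 >> 32);
}

uint16_t unpack_trap(uint64_t r12)
{
	return (uint16_t)(r12 & 0xffff);
}
```

For example, a CR image of 0x12345678 together with vector 0xc00 packs to 0x1234567800000c00.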
2016-09-30 12:43:18 +03:00
2016-09-21 10:43:44 +03:00
2020-02-25 20:35:28 +03:00
/**
* Interrupt 0xd00 - Trace Interrupt.
* This is a synchronous interrupt in response to instruction step or
* breakpoint faults.
*/
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(single_step)
IVEC=0xd00
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(single_step)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(single_step, 0xd00, 0x100)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY single_step, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(single_step, 0xd00, 0x100)
EXC_VIRT_BEGIN(single_step, 0x4d00, 0x100)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY single_step, virt=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(single_step, 0x4d00, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(single_step_common)
2020-02-25 20:35:14 +03:00
GEN_COMMON single_step
2020-02-25 20:35:13 +03:00
addi r3,r1,STACK_FRAME_OVERHEAD
bl single_step_exception
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified, rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless except when PREEMPT=y, this calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 20:35:37 +03:00
b interrupt_return
2011-06-29 04:18:26 +04:00
2020-02-25 20:35:21 +03:00
GEN_KVM single_step
2019-08-02 13:56:47 +03:00
2020-02-25 20:35:28 +03:00
/**
* Interrupt 0xe00 - Hypervisor Data Storage Interrupt (HDSI).
* This is a synchronous interrupt in response to an MMU fault caused by a
* guest data access.
*
* Handling:
* This should always get routed to KVM. In radix MMU mode, this is caused
* by a guest nested radix access that can't be performed due to the
* partition scope page table. In hash mode, this can be caused by guests
* running with translation disabled (virtual real mode) or with VPM enabled.
* KVM will update the page table structures or disallow the access.
*/
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(h_data_storage)
IVEC=0xe00
2020-02-25 20:35:27 +03:00
IHSRR=1
2020-02-25 20:35:14 +03:00
IDAR=1
IDSISR=1
IKVM_SKIP=1
IKVM_REAL=1
IKVM_VIRT=1
INT_DEFINE_END(h_data_storage)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(h_data_storage, 0xe00, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY h_data_storage, virt=0, ool=1
2019-08-02 13:56:47 +03:00
EXC_REAL_END(h_data_storage, 0xe00, 0x20)
EXC_VIRT_BEGIN(h_data_storage, 0x4e00, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY h_data_storage, virt=1, ool=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(h_data_storage, 0x4e00, 0x20)
2016-09-21 10:43:46 +03:00
EXC_COMMON_BEGIN(h_data_storage_common)
2020-02-25 20:35:14 +03:00
GEN_COMMON h_data_storage
2016-09-21 10:43:46 +03:00
addi r3,r1,STACK_FRAME_OVERHEAD
2018-12-14 08:29:05 +03:00
BEGIN_MMU_FTR_SECTION
2019-08-02 13:56:57 +03:00
ld r4,_DAR(r1)
2018-12-14 08:29:05 +03:00
li r5,SIGSEGV
bl bad_page_fault
MMU_FTR_SECTION_ELSE
2016-09-21 10:43:46 +03:00
bl unknown_exception
2018-12-14 08:29:05 +03:00
ALT_MMU_FTR_SECTION_END_IFSET(MMU_FTR_TYPE_RADIX)
2020-02-25 20:35:37 +03:00
b interrupt_return
2016-09-21 10:43:46 +03:00
2020-02-25 20:35:21 +03:00
GEN_KVM h_data_storage
powerpc: Save CFAR before branching in interrupt entry paths
Some of the interrupt vectors on 64-bit POWER server processors are
only 32 bytes long, which is not enough for the full first-level
interrupt handler. For these we currently just have a branch to an
out-of-line handler. However, this means that we corrupt the CFAR
(come-from address register) on POWER7 and later processors.
To fix this, we split the EXCEPTION_PROLOG_1 macro into two pieces:
EXCEPTION_PROLOG_0 contains the part up to the point where the CFAR
is saved in the PACA, and EXCEPTION_PROLOG_1 contains the rest. We
then put EXCEPTION_PROLOG_0 in the short interrupt vectors before
we branch to the out-of-line handler, which contains the rest of the
first-level interrupt handler. To facilitate this, we define new
_OOL (out of line) variants of STD_EXCEPTION_PSERIES, etc.
In order to get EXCEPTION_PROLOG_0 to be short enough, i.e., no more
than 6 instructions, it was necessary to move the stores that move
the PPR and CFAR values into the PACA into __EXCEPTION_PROLOG_1 and
to get rid of one of the two HMT_MEDIUM instructions. Previously
there was a HMT_MEDIUM_PPR_DISCARD before the prolog, which was
nop'd out on processors with the PPR (POWER7 and later), and then
another HMT_MEDIUM inside the HMT_MEDIUM_PPR_SAVE macro call inside
__EXCEPTION_PROLOG_1, which was nop'd out on processors without PPR.
Now the HMT_MEDIUM inside EXCEPTION_PROLOG_0 is there unconditionally
and the HMT_MEDIUM_PPR_DISCARD is not strictly necessary, although
this leaves it in for the interrupt vectors where there is room for
it.
Previously we had a handler for hypervisor maintenance interrupts at
0xe50, which doesn't leave enough room for the vector for hypervisor
emulation assist interrupts at 0xe40, since we need 8 instructions.
The 0xe50 vector was only used on POWER6, as the HMI vector was moved
to 0xe60 on POWER7. Since we don't support running in hypervisor mode
on POWER6, we just remove the handler at 0xe50.
This also changes denorm_exception_hv to use EXCEPTION_PROLOG_0
instead of open-coding it, and removes the HMT_MEDIUM_PPR_DISCARD
from the relocation-on vectors (since any CPU that supports
relocation-on interrupts also has the PPR).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
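The size argument in the commit message above is simple arithmetic over vector offsets, sketched here in C. The helper name `insn_slots` and the constant `PPC_INSN_BYTES` are hypothetical; the sketch assumes the fixed 4-byte instruction encoding and is not taken from the kernel.

```c
#include <stdint.h>

#define PPC_INSN_BYTES 4u	/* assumption: fixed-width 4-byte instructions */

/*
 * Hypothetical helper: number of instruction slots available between
 * the vector at offset `start` and the next vector at offset `next`.
 */
uint32_t insn_slots(uint32_t start, uint32_t next)
{
	return (next - start) / PPC_INSN_BYTES;
}
```

Under these assumptions a 32-byte vector such as 0xe00..0xe20 holds 8 instructions. With a handler at 0xe50, the 0xe40 vector would be left with only `insn_slots(0xe40, 0xe50)`, that is 4 slots, which is fewer than the 8 instructions the message says are needed; with the next vector at 0xe60 the full 8 slots are available.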
2013-02-04 22:10:15 +04:00
2020-02-25 20:35:28 +03:00
/**
* Interrupt 0xe20 - Hypervisor Instruction Storage Interrupt (HISI).
* This is a synchronous interrupt in response to an MMU fault caused by a
* guest instruction fetch, similar to HDSI.
*/
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(h_instr_storage)
IVEC=0xe20
2020-02-25 20:35:27 +03:00
IHSRR=1
2020-02-25 20:35:14 +03:00
IKVM_REAL=1
IKVM_VIRT=1
INT_DEFINE_END(h_instr_storage)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(h_instr_storage, 0xe20, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY h_instr_storage, virt=0, ool=1
2019-08-02 13:56:47 +03:00
EXC_REAL_END(h_instr_storage, 0xe20, 0x20)
EXC_VIRT_BEGIN(h_instr_storage, 0x4e20, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY h_instr_storage, virt=1, ool=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(h_instr_storage, 0x4e20, 0x20)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(h_instr_storage_common)
2020-02-25 20:35:14 +03:00
GEN_COMMON h_instr_storage
2020-02-25 20:35:13 +03:00
addi r3,r1,STACK_FRAME_OVERHEAD
bl unknown_exception
2020-02-25 20:35:37 +03:00
b interrupt_return
2016-09-21 10:43:47 +03:00
2020-02-25 20:35:21 +03:00
GEN_KVM h_instr_storage
2013-02-04 22:10:15 +04:00
2020-02-25 20:35:28 +03:00
/**
* Interrupt 0xe40 - Hypervisor Emulation Assistance Interrupt.
*/
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(emulation_assist)
IVEC=0xe40
2020-02-25 20:35:27 +03:00
IHSRR=1
2020-02-25 20:35:14 +03:00
IKVM_REAL=1
IKVM_VIRT=1
INT_DEFINE_END(emulation_assist)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(emulation_assist, 0xe40, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY emulation_assist, virt=0, ool=1
2019-08-02 13:56:47 +03:00
EXC_REAL_END(emulation_assist, 0xe40, 0x20)
EXC_VIRT_BEGIN(emulation_assist, 0x4e40, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY emulation_assist, virt=1, ool=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(emulation_assist, 0x4e40, 0x20)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(emulation_assist_common)
2020-02-25 20:35:14 +03:00
GEN_COMMON emulation_assist
2020-02-25 20:35:13 +03:00
addi r3,r1,STACK_FRAME_OVERHEAD
bl emulation_assist_interrupt
2020-02-25 20:35:38 +03:00
REST_NVGPRS(r1) /* instruction emulation may change GPRs */
2020-02-25 20:35:37 +03:00
b interrupt_return
2016-09-21 10:43:48 +03:00
2020-02-25 20:35:21 +03:00
GEN_KVM emulation_assist
2013-02-04 22:10:15 +04:00
2020-02-25 20:35:28 +03:00
/**
* Interrupt 0xe60 - Hypervisor Maintenance Interrupt (HMI).
* This is an asynchronous interrupt caused by a Hypervisor Maintenance
* Exception. It is always taken in real mode but uses HSRR registers
* unlike SRESET and MCE.
*
* It is maskable in hardware by clearing MSR[EE], and partially soft-maskable
* with IRQS_DISABLED mask (i.e., local_irq_disable()).
*
* Handling:
* This is a special case, this is handled similarly to machine checks, with an
* initial real mode handler that is not soft-masked, which attempts to fix the
* problem. Then a regular handler which is soft-maskable and reports the
* problem.
*
* The emergency stack is used for the early real mode handler.
*
* XXX: unclear why MCE and HMI schemes could not be made common, e.g.,
* either use soft-masking for the MCE, or use irq_work for the HMI.
*
* KVM:
* Unlike MCE, this calls into KVM without calling the real mode handler
* first.
2016-09-21 10:44:07 +03:00
*/
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(hmi_exception_early)
IVEC=0xe60
2020-02-25 20:35:27 +03:00
IHSRR=1
2020-02-25 20:35:22 +03:00
IREALMODE_COMMON=1
2020-02-25 20:35:14 +03:00
ISTACK=0
IRECONCILE=0
IKUAP=0 /* We don't touch AMR here, we never go to virtual mode */
IKVM_REAL=1
INT_DEFINE_END(hmi_exception_early)
INT_DEFINE_BEGIN(hmi_exception)
IVEC=0xe60
2020-02-25 20:35:27 +03:00
IHSRR=1
2020-02-25 20:35:14 +03:00
IMASK=IRQS_DISABLED
IKVM_REAL=1
INT_DEFINE_END(hmi_exception)
2019-06-28 09:33:21 +03:00
EXC_REAL_BEGIN(hmi_exception, 0xe60, 0x20)
2020-02-25 20:35:14 +03:00
GEN_INT_ENTRY hmi_exception_early, virt=0, ool=1
2019-06-28 09:33:21 +03:00
EXC_REAL_END(hmi_exception, 0xe60, 0x20)
2016-12-06 04:41:12 +03:00
EXC_VIRT_NONE(0x4e60, 0x20)
2020-02-25 20:35:14 +03:00
2019-06-28 09:33:22 +03:00
EXC_COMMON_BEGIN(hmi_exception_early_common)
2020-02-25 20:35:21 +03:00
__GEN_REALMODE_COMMON_ENTRY hmi_exception_early
powerpc/64s: Exception macro for stack frame and initial register save
This code is common to a few exceptions, and another user will be added.
This causes a trivial change to generated code:
- 604: std r9,416(r1)
- 608: mfspr r11,314
- 60c: std r11,368(r1)
- 610: mfspr r12,315
+ 604: mfspr r11,314
+ 608: mfspr r12,315
+ 60c: std r9,416(r1)
+ 610: std r11,368(r1)
machine_check_powernv_early could also use this, but that requires non
trivial changes to generated code, so that's for another patch.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
	mr	r10,r1			/* Save r1 */
	ld	r1,PACAEMERGSP(r13)	/* Use emergency stack for realmode */
	subi	r1,r1,INT_FRAME_SIZE	/* alloc stack frame */

	__GEN_COMMON_BODY hmi_exception_early

	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	hmi_exception_realmode
	cmpdi	cr0,r3,0
	bne	1f

	EXCEPTION_RESTORE_REGS hsrr=1
	HRFI_TO_USER_OR_KERNEL

1:
	/*
	 * Go to virtual mode and pull the HMI event information from
	 * firmware.
	 */
	EXCEPTION_RESTORE_REGS hsrr=1
	GEN_INT_ENTRY hmi_exception, virt=0

	GEN_KVM hmi_exception_early

EXC_COMMON_BEGIN(hmi_exception_common)
	GEN_COMMON hmi_exception
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	handle_hmi_exception
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified, rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless except when PREEMPT=y, this calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
	b	interrupt_return
powerpc: Save CFAR before branching in interrupt entry paths
Some of the interrupt vectors on 64-bit POWER server processors are
only 32 bytes long, which is not enough for the full first-level
interrupt handler. For these we currently just have a branch to an
out-of-line handler. However, this means that we corrupt the CFAR
(come-from address register) on POWER7 and later processors.
To fix this, we split the EXCEPTION_PROLOG_1 macro into two pieces:
EXCEPTION_PROLOG_0 contains the part up to the point where the CFAR
is saved in the PACA, and EXCEPTION_PROLOG_1 contains the rest. We
then put EXCEPTION_PROLOG_0 in the short interrupt vectors before
we branch to the out-of-line handler, which contains the rest of the
first-level interrupt handler. To facilitate this, we define new
_OOL (out of line) variants of STD_EXCEPTION_PSERIES, etc.
In order to get EXCEPTION_PROLOG_0 to be short enough, i.e., no more
than 6 instructions, it was necessary to move the stores that move
the PPR and CFAR values into the PACA into __EXCEPTION_PROLOG_1 and
to get rid of one of the two HMT_MEDIUM instructions. Previously
there was a HMT_MEDIUM_PPR_DISCARD before the prolog, which was
nop'd out on processors with the PPR (POWER7 and later), and then
another HMT_MEDIUM inside the HMT_MEDIUM_PPR_SAVE macro call inside
__EXCEPTION_PROLOG_1, which was nop'd out on processors without PPR.
Now the HMT_MEDIUM inside EXCEPTION_PROLOG_0 is there unconditionally
and the HMT_MEDIUM_PPR_DISCARD is not strictly necessary, although
this leaves it in for the interrupt vectors where there is room for
it.
Previously we had a handler for hypervisor maintenance interrupts at
0xe50, which doesn't leave enough room for the vector for hypervisor
emulation assist interrupts at 0xe40, since we need 8 instructions.
The 0xe50 vector was only used on POWER6, as the HMI vector was moved
to 0xe60 on POWER7. Since we don't support running in hypervisor mode
on POWER6, we just remove the handler at 0xe50.
This also changes denorm_exception_hv to use EXCEPTION_PROLOG_0
instead of open-coding it, and removes the HMT_MEDIUM_PPR_DISCARD
from the relocation-on vectors (since any CPU that supports
relocation-on interrupts also has the PPR).
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
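
The sizing argument in the commit message above can be checked with a little arithmetic. The sketch below is illustrative only: it assumes every instruction in a vector is one 4-byte word, and the helper name `slot_instructions` is hypothetical, not something from the kernel.

```python
# Assumption: each instruction occupies one 4-byte word.
INSN_BYTES = 4


def slot_instructions(start, next_start):
    """Return how many 4-byte instructions fit between two vector addresses."""
    return (next_start - start) // INSN_BYTES


# A 0x20-byte vector such as 0xe40..0xe60 holds 8 instructions.
full_slot = slot_instructions(0xe40, 0xe60)

# With a handler placed at 0xe50, the 0xe40 vector is cut to 0x10 bytes.
cut_slot = slot_instructions(0xe40, 0xe50)

# EXCEPTION_PROLOG_0 is stated to be at most 6 instructions, so this many
# instructions remain in a full slot for the branch to the out-of-line handler.
PROLOG_0_MAX = 6
spare_after_prolog = full_slot - PROLOG_0_MAX

print(full_slot, cut_slot, spare_after_prolog)
```

Under these assumptions a handler at 0xe50 leaves only half of the 8 instructions the 0xe40 vector needs, which matches the reason given for removing it.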
	GEN_KVM hmi_exception

/**
 * Interrupt 0xe80 - Directed Hypervisor Doorbell Interrupt.
 * This is an asynchronous interrupt in response to a msgsnd doorbell.
 * Similar to the 0xa00 doorbell but for host rather than guest.
 */
INT_DEFINE_BEGIN(h_doorbell)
	IVEC=0xe80
	IHSRR=1
	IMASK=IRQS_DISABLED
	IKVM_REAL=1
	IKVM_VIRT=1
INT_DEFINE_END(h_doorbell)

EXC_REAL_BEGIN(h_doorbell, 0xe80, 0x20)
	GEN_INT_ENTRY h_doorbell, virt=0, ool=1
EXC_REAL_END(h_doorbell, 0xe80, 0x20)
EXC_VIRT_BEGIN(h_doorbell, 0x4e80, 0x20)
	GEN_INT_ENTRY h_doorbell, virt=1, ool=1
EXC_VIRT_END(h_doorbell, 0x4e80, 0x20)
EXC_COMMON_BEGIN(h_doorbell_common)
	GEN_COMMON h_doorbell
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
#ifdef CONFIG_PPC_DOORBELL
	bl	doorbell_exception
#else
	bl	unknown_exception
#endif
	b	interrupt_return

	GEN_KVM h_doorbell
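
Every EXC_REAL/EXC_VIRT pair in this part of the file uses the same slot size (0x20), and the virt address is the real address plus 0x4000. The sketch below only checks that pattern against the address pairs visible in this listing. The constant `VIRT_OFFSET` and the helper `virt_vector` are hypothetical names, and the offset is inferred from the listing rather than taken from an architecture document.

```python
# Assumption: offset inferred from the real/virt address pairs in this listing.
VIRT_OFFSET = 0x4000


def virt_vector(real_vector):
    """Return the relocation-on (virt) address paired with a real-mode vector."""
    return real_vector + VIRT_OFFSET


# Address pairs as they appear in the EXC_REAL_* / EXC_VIRT_* macro arguments.
LISTED_PAIRS = {
    0xe60: 0x4e60,
    0xe80: 0x4e80,
    0xea0: 0x4ea0,
    0xec0: 0x4ec0,
    0xee0: 0x4ee0,
    0xf00: 0x4f00,
    0xf20: 0x4f20,
    0xf40: 0x4f40,
    0xf60: 0x4f60,
}

mismatches = [hex(r) for r, v in LISTED_PAIRS.items() if virt_vector(r) != v]
print(mismatches)
```

An empty `mismatches` list means the listed pairs all follow the assumed offset.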
/**
 * Interrupt 0xea0 - Hypervisor Virtualization Interrupt.
 * This is an asynchronous interrupt in response to an "external exception".
 * Similar to 0x500 but for host only.
 */
INT_DEFINE_BEGIN(h_virt_irq)
	IVEC=0xea0
	IHSRR=1
	IMASK=IRQS_DISABLED
	IKVM_REAL=1
	IKVM_VIRT=1
INT_DEFINE_END(h_virt_irq)

EXC_REAL_BEGIN(h_virt_irq, 0xea0, 0x20)
	GEN_INT_ENTRY h_virt_irq, virt=0, ool=1
EXC_REAL_END(h_virt_irq, 0xea0, 0x20)
EXC_VIRT_BEGIN(h_virt_irq, 0x4ea0, 0x20)
	GEN_INT_ENTRY h_virt_irq, virt=1, ool=1
EXC_VIRT_END(h_virt_irq, 0x4ea0, 0x20)
EXC_COMMON_BEGIN(h_virt_irq_common)
	GEN_COMMON h_virt_irq
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	do_IRQ
	b	interrupt_return

	GEN_KVM h_virt_irq

EXC_REAL_NONE(0xec0, 0x20)
EXC_VIRT_NONE(0x4ec0, 0x20)
EXC_REAL_NONE(0xee0, 0x20)
EXC_VIRT_NONE(0x4ee0, 0x20)

/*
 * Interrupt 0xf00 - Performance Monitor Interrupt (PMI, PMU).
 * This is an asynchronous interrupt in response to a PMU exception.
 * It is maskable in hardware by clearing MSR[EE], and soft-maskable with
 * IRQS_PMI_DISABLED mask (NOTE: NOT local_irq_disable()).
 *
 * Handling:
 * This calls into the perf subsystem.
 *
 * Like the watchdog soft-nmi, it appears an NMI interrupt to Linux, in that it
 * runs under local_irq_disable. However it may be soft-masked in
 * powerpc-specific code.
 *
 * If soft masked, the masked handler will note the pending interrupt for
 * replay, and clear MSR[EE] in the interrupted context.
 */
INT_DEFINE_BEGIN(performance_monitor)
	IVEC=0xf00
	IMASK=IRQS_PMI_DISABLED
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(performance_monitor)

EXC_REAL_BEGIN(performance_monitor, 0xf00, 0x20)
	GEN_INT_ENTRY performance_monitor, virt=0, ool=1
EXC_REAL_END(performance_monitor, 0xf00, 0x20)
EXC_VIRT_BEGIN(performance_monitor, 0x4f00, 0x20)
	GEN_INT_ENTRY performance_monitor, virt=1, ool=1
EXC_VIRT_END(performance_monitor, 0x4f00, 0x20)
EXC_COMMON_BEGIN(performance_monitor_common)
	GEN_COMMON performance_monitor
	FINISH_NAP
	RUNLATCH_ON
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	performance_monitor_exception
	b	interrupt_return

	GEN_KVM performance_monitor
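
The IMASK values above decide which soft-mask bit holds an interrupt back. The toy model below sketches that idea: an interrupt is recorded as pending when its IMASK bit is set in the current soft mask, and replayed once the bit is cleared. It is a simplification under stated assumptions: the numeric bit values are chosen for illustration, the class `SoftMaskModel` and its methods are hypothetical, and the real masked handler also clears MSR[EE], which is not modelled here.

```python
# Assumed bit values, for illustration only; the kernel defines its own.
IRQS_ENABLED = 0
IRQS_DISABLED = 1
IRQS_PMI_DISABLED = 2


class SoftMaskModel:
    """Toy model of soft-masking: mask, note pending, replay on unmask."""

    def __init__(self):
        self.soft_mask = IRQS_ENABLED
        self.pending = {}      # interrupt name -> IMASK bit that blocked it
        self.delivered = []    # names in the order their handlers ran

    def raise_irq(self, name, imask):
        if self.soft_mask & imask:
            # Soft-masked: only note it for later replay.
            self.pending[name] = imask
        else:
            self.delivered.append(name)

    def set_mask(self, new_mask):
        self.soft_mask = new_mask
        # Replay anything that is no longer blocked by the new mask.
        ready = [n for n, m in self.pending.items() if not (new_mask & m)]
        for name in ready:
            del self.pending[name]
            self.delivered.append(name)


cpu = SoftMaskModel()
cpu.set_mask(IRQS_DISABLED)                              # models local_irq_disable()
cpu.raise_irq("performance_monitor", IRQS_PMI_DISABLED)  # not blocked by IRQS_DISABLED
cpu.raise_irq("h_doorbell", IRQS_DISABLED)               # blocked, noted as pending
cpu.set_mask(IRQS_DISABLED | IRQS_PMI_DISABLED)
cpu.raise_irq("performance_monitor", IRQS_PMI_DISABLED)  # now blocked as well
cpu.set_mask(IRQS_ENABLED)                               # both pending entries replay
print(cpu.delivered)
```

The first PMI is delivered even though the model is in the local_irq_disable() state, which is the point of the "NOT local_irq_disable()" note in the comment.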

/**
 * Interrupt 0xf20 - Vector Unavailable Interrupt.
 * This is a synchronous interrupt in response to
 * executing a vector (or altivec) instruction with MSR[VEC]=0.
 * Similar to FP unavailable.
 */
INT_DEFINE_BEGIN(altivec_unavailable)
	IVEC=0xf20
	IRECONCILE=0
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(altivec_unavailable)

EXC_REAL_BEGIN(altivec_unavailable, 0xf20, 0x20)
	GEN_INT_ENTRY altivec_unavailable, virt=0, ool=1
EXC_REAL_END(altivec_unavailable, 0xf20, 0x20)
EXC_VIRT_BEGIN(altivec_unavailable, 0x4f20, 0x20)
	GEN_INT_ENTRY altivec_unavailable, virt=1, ool=1
EXC_VIRT_END(altivec_unavailable, 0x4f20, 0x20)
EXC_COMMON_BEGIN(altivec_unavailable_common)
	GEN_COMMON altivec_unavailable
#ifdef CONFIG_ALTIVEC
BEGIN_FTR_SECTION
	beq	1f
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
  BEGIN_FTR_SECTION_NESTED(69)
	/* Test if 2 TM state bits are zero.  If non-zero (ie. userspace was in
	 * transaction), go do TM stuff
	 */
	rldicl.	r0, r12, (64-MSR_TS_LG), (64-2)
	bne-	2f
  END_FTR_SECTION_NESTED(CPU_FTR_TM, CPU_FTR_TM, 69)
#endif
	bl	load_up_altivec
	b	fast_interrupt_return
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
2:	/* User process was in a transaction */
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	altivec_unavailable_tm
	b	interrupt_return
#endif
1:
END_FTR_SECTION_IFSET(CPU_FTR_ALTIVEC)
#endif
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	altivec_unavailable_exception
	b	interrupt_return

	GEN_KVM altivec_unavailable
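
The `rldicl.` test in the handler above rotates the saved MSR (in r12) so that the two TM state bits land in the lowest two bit positions, then clears everything else. The sketch below models that instruction on plain integers to show the result is the same as a shift and mask. The value of `MSR_TS_LG` (33) is an assumption made for this illustration, and the helper names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MSR_TS_LG = 33  # assumption: bit position of the lower TM state bit


def rotl64(value, shift):
    """Rotate a 64-bit value left by shift bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def rldicl(value, sh, mb):
    """Model of rldicl: rotate left by sh, then clear the mb leftmost bits."""
    return rotl64(value, sh) & (MASK64 >> mb)


def tm_state_bits(msr):
    """Mirror of: rldicl. r0, r12, (64-MSR_TS_LG), (64-2)"""
    return rldicl(msr, 64 - MSR_TS_LG, 64 - 2)


# Non-zero means userspace was in a transaction, so the handler branches to 2f.
for msr in (0, 1 << MSR_TS_LG, 1 << (MSR_TS_LG + 1), 3 << MSR_TS_LG):
    print(hex(msr), tm_state_bits(msr))
```

Rotating left by `64 - MSR_TS_LG` is the same as rotating right by `MSR_TS_LG`, so under this assumption the result equals `(msr >> MSR_TS_LG) & 3`.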

/**
 * Interrupt 0xf40 - VSX Unavailable Interrupt.
 * This is a synchronous interrupt in response to
 * executing a VSX instruction with MSR[VSX]=0.
 * Similar to FP unavailable.
 */
INT_DEFINE_BEGIN(vsx_unavailable)
	IVEC=0xf40
	IRECONCILE=0
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(vsx_unavailable)

EXC_REAL_BEGIN(vsx_unavailable, 0xf40, 0x20)
	GEN_INT_ENTRY vsx_unavailable, virt=0, ool=1
EXC_REAL_END(vsx_unavailable, 0xf40, 0x20)
EXC_VIRT_BEGIN(vsx_unavailable, 0x4f40, 0x20)
	GEN_INT_ENTRY vsx_unavailable, virt=1, ool=1
EXC_VIRT_END(vsx_unavailable, 0x4f40, 0x20)
EXC_COMMON_BEGIN(vsx_unavailable_common)
	GEN_COMMON vsx_unavailable
#ifdef CONFIG_VSX
BEGIN_FTR_SECTION
	beq	1f
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
  BEGIN_FTR_SECTION_NESTED(69)
	/* Test if 2 TM state bits are zero.  If non-zero (ie. userspace was in
	 * transaction), go do TM stuff
	 */
	rldicl.	r0, r12, (64-MSR_TS_LG), (64-2)
	bne-	2f
  END_FTR_SECTION_NESTED(CPU_FTR_TM, CPU_FTR_TM, 69)
#endif
	b	load_up_vsx
#ifdef CONFIG_PPC_TRANSACTIONAL_MEM
2:	/* User process was in a transaction */
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	vsx_unavailable_tm
	b	interrupt_return
#endif
1:
END_FTR_SECTION_IFSET(CPU_FTR_VSX)
#endif
	RECONCILE_IRQ_STATE(r10, r11)
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	vsx_unavailable_exception
	b	interrupt_return

	GEN_KVM vsx_unavailable

/**
 * Interrupt 0xf60 - Facility Unavailable Interrupt.
 * This is a synchronous interrupt in response to
 * executing an instruction without access to the facility that can be
 * resolved by the OS (e.g., FSCR, MSR).
 * Similar to FP unavailable.
 */
INT_DEFINE_BEGIN(facility_unavailable)
	IVEC=0xf60
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
	IKVM_REAL=1
#endif
INT_DEFINE_END(facility_unavailable)

EXC_REAL_BEGIN(facility_unavailable, 0xf60, 0x20)
	GEN_INT_ENTRY facility_unavailable, virt=0, ool=1
EXC_REAL_END(facility_unavailable, 0xf60, 0x20)
EXC_VIRT_BEGIN(facility_unavailable, 0x4f60, 0x20)
	GEN_INT_ENTRY facility_unavailable, virt=1, ool=1
EXC_VIRT_END(facility_unavailable, 0x4f60, 0x20)
EXC_COMMON_BEGIN(facility_unavailable_common)
	GEN_COMMON facility_unavailable
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	facility_unavailable_exception
powerpc/64s: Fix restore of NV GPRs after facility unavailable exception
Commit 702f09805222 ("powerpc/64s/exception: Remove lite interrupt
return") changed the interrupt return path to not restore non-volatile
registers by default, and explicitly restore them in paths where it is
required.
But it missed that the facility unavailable exception can sometimes
modify user registers, ie. when it does emulation of move from DSCR.
This is seen as a failure of the dscr_sysfs_thread_test:
test: dscr_sysfs_thread_test
[cpu 0] User DSCR should be 1 but is 0
failure: dscr_sysfs_thread_test
So restore non-volatile GPRs after facility unavailable exceptions.
Currently the hypervisor facility unavailable exception is also wired
up to call facility_unavailable_exception().
In practice we should never take a hypervisor facility unavailable
exception for the DSCR. On older bare metal systems we set HFSCR_DSCR
unconditionally in __init_HFSCR, or on newer systems it should be
enabled via the "data-stream-control-register" device tree CPU
feature.
Even if it's not, since commit f3c99f97a3cd ("KVM: PPC: Book3S HV:
Don't access HFSCR, LPIDR or LPCR when running nested"), the KVM code
has unconditionally set HFSCR_DSCR when running guests.
So we should only get a hypervisor facility unavailable for the DSCR
if skiboot has disabled the "data-stream-control-register" feature,
and we are somehow in guest context but not via KVM.
Given all that, it should be unnecessary to add a restore of
non-volatile GPRs after the hypervisor facility exception, because we
never expect to hit that path. But equally we may as well add the
restore, because we never expect to hit that path, and if we ever did,
at least we would correctly restore the registers to their post
emulation state.
In future we can split the non-HV and HV facility unavailable handling
so that there is no emulation in the HV handler, and then remove the
restore for the HV case.
Fixes: 702f09805222 ("powerpc/64s/exception: Remove lite interrupt return")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200526061808.2472279-1-mpe@ellerman.id.au
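The failure described in this commit message can be pictured with a toy model. This is a minimal Python sketch, not kernel code: the list-based register frame and the function names (`emulate_mfdscr`, `interrupt_return`, `run`) are hypothetical, and the only idea taken from the message is that the handler writes its emulation result into the saved register frame while the return path skips non-volatile GPRs unless told to restore them. Treating r14-r31 as non-volatile follows the usual 64-bit PowerPC ELF ABI convention.

```python
# Toy model (assumption, not kernel code): GPRs r14-r31 are treated as
# non-volatile, everything else as volatile.
NONVOLATILE = range(14, 32)


def emulate_mfdscr(frame, target_reg, dscr_value):
    """Hypothetical stand-in for the handler emulating a move from DSCR.

    The result is written to the saved frame, not to the live registers.
    """
    frame[target_reg] = dscr_value


def interrupt_return(live, frame, restore_nvgprs):
    """Reload live registers from the saved frame on interrupt exit."""
    for reg in range(32):
        if reg in NONVOLATILE and not restore_nvgprs:
            continue  # skipped: the live value is assumed to be unchanged
        live[reg] = frame[reg]
    return live


def run(restore_nvgprs, target_reg=20, dscr_value=1):
    """Return what user space would see in target_reg after the exception."""
    live = [0] * 32
    frame = list(live)  # registers saved at interrupt entry
    emulate_mfdscr(frame, target_reg, dscr_value)
    return interrupt_return(live, frame, restore_nvgprs)[target_reg]
```

In this model, `run(False)` leaves the stale value in place, which mirrors the "should be 1 but is 0" symptom, while `run(True)` returns the emulated value.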
2020-05-26 09:18:08 +03:00
	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified, rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless except when PREEMPT=y, this calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:43:56 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM facility_unavailable
2016-09-30 12:43:18 +03:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0xf80 - Hypervisor Facility Unavailable Interrupt.
 * This is a synchronous interrupt in response to
 * executing an instruction without access to the facility that can only
 * be resolved in HV mode (e.g., HFSCR).
 * Similar to FP unavailable.
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(h_facility_unavailable)
	IVEC=0xf80
2020-02-25 20:35:27 +03:00
	IHSRR=1
2020-02-25 20:35:14 +03:00
	IKVM_REAL=1
	IKVM_VIRT=1
INT_DEFINE_END(h_facility_unavailable)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(h_facility_unavailable, 0xf80, 0x20)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY h_facility_unavailable, virt=0, ool=1
2019-08-02 13:56:47 +03:00
EXC_REAL_END(h_facility_unavailable, 0xf80, 0x20)
EXC_VIRT_BEGIN(h_facility_unavailable, 0x4f80, 0x20)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY h_facility_unavailable, virt=1, ool=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(h_facility_unavailable, 0x4f80, 0x20)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(h_facility_unavailable_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON h_facility_unavailable
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	facility_unavailable_exception
2020-05-26 09:18:08 +03:00
	REST_NVGPRS(r1) /* XXX Shouldn't be necessary in practice */
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:43:57 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM h_facility_unavailable
2016-09-30 12:43:18 +03:00
2016-12-06 04:41:12 +03:00
EXC_REAL_NONE(0xfa0, 0x20)
EXC_VIRT_NONE(0x4fa0, 0x20)
EXC_REAL_NONE(0xfc0, 0x20)
EXC_VIRT_NONE(0x4fc0, 0x20)
EXC_REAL_NONE(0xfe0, 0x20)
EXC_VIRT_NONE(0x4fe0, 0x20)
EXC_REAL_NONE(0x1000, 0x100)
EXC_VIRT_NONE(0x5000, 0x100)
EXC_REAL_NONE(0x1100, 0x100)
EXC_VIRT_NONE(0x5100, 0x100)
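Every real-mode vector in this listing is paired with a virtual-mode vector 0x4000 higher (0xf80 with 0x4f80, 0x1000 with 0x5000, and so on). The following Python sketch only restates that pairing; the helper name `virt_vector` and the constant name `VIRT_OFFSET` are hypothetical, and the offset is read off the pairs shown here rather than taken from a kernel header.

```python
# Offset between a real-mode vector and its virtual-mode twin, inferred
# from the EXC_REAL_* / EXC_VIRT_* pairs in this listing (assumption).
VIRT_OFFSET = 0x4000


def virt_vector(real_vector):
    """Return the virtual-mode vector paired with a real-mode vector."""
    return real_vector + VIRT_OFFSET


# Pairs copied from the EXC_REAL_NONE / EXC_VIRT_NONE lines above.
unused_vectors = [0xfa0, 0xfc0, 0xfe0, 0x1000, 0x1100]
unused_pairs = {real: virt_vector(real) for real in unused_vectors}
```

For instance, `hex(virt_vector(0xf80))` gives the 0x4f80 used by `EXC_VIRT_BEGIN(h_facility_unavailable, ...)`.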
2013-02-13 20:21:38 +04:00
2009-06-03 01:17:38 +04:00
#ifdef CONFIG_CBE_RAS
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(cbe_system_error)
	IVEC=0x1200
2020-02-25 20:35:27 +03:00
	IHSRR=1
2020-02-25 20:35:14 +03:00
	IKVM_SKIP=1
	IKVM_REAL=1
INT_DEFINE_END(cbe_system_error)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(cbe_system_error, 0x1200, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY cbe_system_error, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(cbe_system_error, 0x1200, 0x100)
2016-12-06 04:41:12 +03:00
EXC_VIRT_NONE(0x5200, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(cbe_system_error_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON cbe_system_error
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	cbe_system_error_exception
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2020-02-25 20:35:21 +03:00
	GEN_KVM cbe_system_error
2016-09-30 12:43:18 +03:00
#else /* CONFIG_CBE_RAS */
2016-12-06 04:41:12 +03:00
EXC_REAL_NONE(0x1200, 0x100)
EXC_VIRT_NONE(0x5200, 0x100)
2016-09-30 12:43:18 +03:00
#endif
2011-06-29 04:18:26 +04:00
2016-09-21 10:43:59 +03:00
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(instruction_breakpoint)
	IVEC=0x1300
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_SKIP=1
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(instruction_breakpoint)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(instruction_breakpoint, 0x1300, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_breakpoint, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(instruction_breakpoint, 0x1300, 0x100)
EXC_VIRT_BEGIN(instruction_breakpoint, 0x5300, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY instruction_breakpoint, virt=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(instruction_breakpoint, 0x5300, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(instruction_breakpoint_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON instruction_breakpoint
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	instruction_breakpoint_exception
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:44:00 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM instruction_breakpoint
2019-08-02 13:56:47 +03:00
2016-12-06 04:41:12 +03:00
EXC_REAL_NONE(0x1400, 0x100)
EXC_VIRT_NONE(0x5400, 0x100)
2016-09-30 12:43:18 +03:00
2020-02-25 20:35:28 +03:00
/**
 * Interrupt 0x1500 - Soft Patch Interrupt
 *
 * Handling:
 * This is an implementation specific interrupt which can be used for a
 * range of exceptions.
 *
 * This interrupt handler is unique in that it runs the denormal assist
 * code even for guests (and even in guest context) without going to KVM,
 * for speed. POWER9 does not raise denorm exceptions, so this special case
 * could be phased out in future to reduce special cases.
 */
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(denorm_exception)
	IVEC=0x1500
2020-02-25 20:35:27 +03:00
	IHSRR=1
2020-07-08 10:49:42 +03:00
	IBRANCH_TO_COMMON=0
2020-02-25 20:35:21 +03:00
	IKVM_REAL=1
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(denorm_exception)
EXC_REAL_BEGIN(denorm_exception, 0x1500, 0x100)
	GEN_INT_ENTRY denorm_exception, virt=0
2012-09-10 04:35:26 +04:00
#ifdef CONFIG_PPC_DENORMALISATION
2020-02-25 20:35:22 +03:00
	andis.	r10,r12,(HSRR1_DENORM)@h /* denorm? */
2016-09-21 10:43:31 +03:00
	bne+	denorm_assist
#endif
2020-02-25 20:35:19 +03:00
	GEN_BRANCH_TO_COMMON denorm_exception, virt=0
2020-02-25 20:35:14 +03:00
EXC_REAL_END(denorm_exception, 0x1500, 0x100)
2016-09-21 10:44:01 +03:00
#ifdef CONFIG_PPC_DENORMALISATION
2016-12-06 04:41:12 +03:00
EXC_VIRT_BEGIN(denorm_exception, 0x5500, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY denorm_exception, virt=1
2020-02-25 20:35:22 +03:00
	andis.	r10,r12,(HSRR1_DENORM)@h /* denorm? */
2019-08-02 13:56:49 +03:00
	bne+	denorm_assist
2020-02-25 20:35:19 +03:00
	GEN_BRANCH_TO_COMMON denorm_exception, virt=1
2016-12-06 04:41:12 +03:00
EXC_VIRT_END(denorm_exception, 0x5500, 0x100)
2016-09-21 10:44:01 +03:00
#else
2016-12-06 04:41:12 +03:00
EXC_VIRT_NONE(0x5500, 0x100)
2016-09-21 10:43:31 +03:00
#endif
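The `andis.` / `bne+` pair in the denorm entry code is a bit test followed by a conditional branch. The Python sketch below models it under stated assumptions: the numeric value given to `HSRR1_DENORM` is a placeholder for illustration and should be checked against the kernel headers, r12 is assumed to hold the saved HSRR1 at that point, and the helper names (`high16`, `andis`, `takes_denorm_assist`) are hypothetical.

```python
HSRR1_DENORM = 0x00100000  # assumed value, for illustration only


def high16(value):
    """Model of the assembler's @h operator: bits 16-31 of the value."""
    return (value >> 16) & 0xFFFF


def andis(rs, ui):
    """Model of andis.: AND rs with (ui << 16).

    The real instruction also sets CR0 from the result; here the caller
    simply compares the returned value with zero.
    """
    return rs & ((ui & 0xFFFF) << 16)


def takes_denorm_assist(hsrr1):
    """True when bne+ would branch to denorm_assist (result is non-zero)."""
    return andis(hsrr1, high16(HSRR1_DENORM)) != 0
```

When the tested bit is clear the branch is not taken, and the entry code falls through to `GEN_BRANCH_TO_COMMON`.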
2011-06-29 04:18:26 +04:00
2012-09-10 04:35:26 +04:00
#ifdef CONFIG_PPC_DENORMALISATION
2016-09-30 12:43:18 +03:00
TRAMP_REAL_BEGIN(denorm_assist)
2012-09-10 04:35:26 +04:00
BEGIN_FTR_SECTION
/*
 * To denormalise we need to move a copy of the register to itself.
 * For POWER6 do that here for all FP regs.
 */
	mfmsr	r10
	ori	r10,r10,(MSR_FP|MSR_FE0|MSR_FE1)
	xori	r10,r10,(MSR_FE0|MSR_FE1)
	mtmsrd	r10
	sync
2013-05-30 01:33:18 +04:00
2019-06-22 16:15:33 +03:00
	.Lreg = 0
	.rept 32
	fmr	.Lreg,.Lreg
	.Lreg = .Lreg + 1
	.endr
2013-05-30 01:33:18 +04:00
2012-09-10 04:35:26 +04:00
FTR_SECTION_ELSE
/*
 * To denormalise we need to move a copy of the register to itself.
 * For POWER7 do that here for the first 32 VSX registers only.
 */
	mfmsr	r10
	oris	r10,r10,MSR_VSX@h
	mtmsrd	r10
	sync
2013-05-30 01:33:18 +04:00
2019-06-22 16:15:33 +03:00
	.Lreg = 0
	.rept 32
	XVCPSGNDP(.Lreg,.Lreg,.Lreg)
	.Lreg = .Lreg + 1
	.endr
2013-05-30 01:33:18 +04:00
2012-09-10 04:35:26 +04:00
ALT_FTR_SECTION_END_IFCLR(CPU_FTR_ARCH_206)
2013-05-30 01:33:19 +04:00
BEGIN_FTR_SECTION
	b	denorm_done
END_FTR_SECTION_IFCLR(CPU_FTR_ARCH_207S)
/*
 * To denormalise we need to move a copy of the register to itself.
 * For POWER8 we need to do that for all 64 VSX registers
 */
2019-06-22 16:15:33 +03:00
	.Lreg = 32
	.rept 32
	XVCPSGNDP(.Lreg,.Lreg,.Lreg)
	.Lreg = .Lreg + 1
	.endr
2013-05-30 01:33:19 +04:00
denorm_done:
2018-09-13 08:33:47 +03:00
	mfspr	r11,SPRN_HSRR0
	subi	r11,r11,4
2012-09-10 04:35:26 +04:00
	mtspr	SPRN_HSRR0,r11
	mtcrf	0x80,r9
	ld	r9,PACA_EXGEN+EX_R9(r13)
2020-02-25 20:35:23 +03:00
BEGIN_FTR_SECTION
	ld	r10,PACA_EXGEN+EX_PPR(r13)
	mtspr	SPRN_PPR,r10
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
2013-08-12 10:12:06 +04:00
BEGIN_FTR_SECTION
	ld	r10,PACA_EXGEN+EX_CFAR(r13)
	mtspr	SPRN_CFAR,r10
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
2012-09-10 04:35:26 +04:00
	ld	r10,PACA_EXGEN+EX_R10(r13)
	ld	r11,PACA_EXGEN+EX_R11(r13)
	ld	r12,PACA_EXGEN+EX_R12(r13)
	ld	r13,PACA_EXGEN+EX_R13(r13)
2018-01-09 19:07:15 +03:00
	HRFI_TO_UNKNOWN
2012-09-10 04:35:26 +04:00
	b	.
#endif
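The `.Lreg` / `.rept 32` idiom in `denorm_assist` unrolls one register-to-itself move per register at assembly time, and `denorm_done` then steps HSRR0 back by 4 bytes before returning, presumably so that the instruction which raised the denorm exception runs again. The Python sketch below only mimics those two steps; `expand_rept` and `rewind_hsrr0` are hypothetical helper names, and the generated strings are a textual illustration, not real assembler output.

```python
def expand_rept(start, count, template):
    """Mimic `.Lreg = start; .rept count; <insn>; .Lreg = .Lreg + 1; .endr`."""
    lines = []
    reg = start
    for _ in range(count):
        lines.append(template.format(reg=reg))
        reg += 1  # corresponds to `.Lreg = .Lreg + 1`
    return lines


def rewind_hsrr0(hsrr0):
    """Model of `subi r11,r11,4`: step back one 4-byte instruction."""
    return hsrr0 - 4


# POWER6 path: all 32 FP registers.
fp_moves = expand_rept(0, 32, "fmr {reg},{reg}")
# POWER7 path: the first 32 VSX registers.
vsx_low = expand_rept(0, 32, "XVCPSGNDP({reg},{reg},{reg})")
# POWER8 additionally covers VSX registers 32-63.
vsx_high = expand_rept(32, 32, "XVCPSGNDP({reg},{reg},{reg})")
```

Read this way, the POWER8 path touches `vsx_low + vsx_high`, which is all 64 VSX registers, matching the comment in the source.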
2020-02-25 20:35:14 +03:00
EXC_COMMON_BEGIN(denorm_exception_common)
	GEN_COMMON denorm_exception
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	unknown_exception
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:44:01 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM denorm_exception
2016-09-21 10:44:01 +03:00
#ifdef CONFIG_CBE_RAS
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(cbe_maintenance)
	IVEC=0x1600
2020-02-25 20:35:27 +03:00
	IHSRR=1
2020-02-25 20:35:14 +03:00
	IKVM_SKIP=1
	IKVM_REAL=1
INT_DEFINE_END(cbe_maintenance)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(cbe_maintenance, 0x1600, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY cbe_maintenance, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(cbe_maintenance, 0x1600, 0x100)
2016-12-06 04:41:12 +03:00
EXC_VIRT_NONE(0x5600, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(cbe_maintenance_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON cbe_maintenance
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	cbe_maintenance_exception
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2020-02-25 20:35:21 +03:00
	GEN_KVM cbe_maintenance
2016-09-21 10:44:01 +03:00
#else /* CONFIG_CBE_RAS */
2016-12-06 04:41:12 +03:00
EXC_REAL_NONE(0x1600, 0x100)
EXC_VIRT_NONE(0x5600, 0x100)
2016-09-21 10:44:01 +03:00
#endif
2016-09-21 10:44:02 +03:00
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(altivec_assist)
	IVEC=0x1700
2020-02-25 20:35:29 +03:00
#ifdef CONFIG_KVM_BOOK3S_PR_POSSIBLE
2020-02-25 20:35:14 +03:00
	IKVM_REAL=1
2020-02-25 20:35:29 +03:00
#endif
2020-02-25 20:35:14 +03:00
INT_DEFINE_END(altivec_assist)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(altivec_assist, 0x1700, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY altivec_assist, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(altivec_assist, 0x1700, 0x100)
EXC_VIRT_BEGIN(altivec_assist, 0x5700, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY altivec_assist, virt=1
2019-08-02 13:56:47 +03:00
EXC_VIRT_END(altivec_assist, 0x5700, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(altivec_assist_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON altivec_assist
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
2016-09-21 10:44:03 +03:00
#ifdef CONFIG_ALTIVEC
2020-02-25 20:35:13 +03:00
	bl	altivec_assist_exception
2020-02-25 20:35:38 +03:00
	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
2016-09-21 10:44:03 +03:00
#else
2020-02-25 20:35:13 +03:00
	bl	unknown_exception
2016-09-21 10:44:03 +03:00
#endif
2020-02-25 20:35:37 +03:00
	b	interrupt_return
2016-09-21 10:44:03 +03:00
2020-02-25 20:35:21 +03:00
	GEN_KVM altivec_assist
2016-09-21 10:44:01 +03:00
#ifdef CONFIG_CBE_RAS
2020-02-25 20:35:14 +03:00
INT_DEFINE_BEGIN(cbe_thermal)
	IVEC=0x1800
2020-02-25 20:35:27 +03:00
	IHSRR=1
2020-02-25 20:35:14 +03:00
	IKVM_SKIP=1
	IKVM_REAL=1
INT_DEFINE_END(cbe_thermal)
2019-08-02 13:56:47 +03:00
EXC_REAL_BEGIN(cbe_thermal, 0x1800, 0x100)
2020-02-25 20:35:14 +03:00
	GEN_INT_ENTRY cbe_thermal, virt=0
2019-08-02 13:56:47 +03:00
EXC_REAL_END(cbe_thermal, 0x1800, 0x100)
2016-12-06 04:41:12 +03:00
EXC_VIRT_NONE(0x5800, 0x100)
2020-02-25 20:35:13 +03:00
EXC_COMMON_BEGIN(cbe_thermal_common)
2020-02-25 20:35:14 +03:00
	GEN_COMMON cbe_thermal
2020-02-25 20:35:13 +03:00
	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	cbe_thermal_exception
2020-02-25 20:35:37 +03:00
	b	interrupt_return

	GEN_KVM cbe_thermal

#else /* CONFIG_CBE_RAS */
EXC_REAL_NONE(0x1800, 0x100)
EXC_VIRT_NONE(0x5800, 0x100)
#endif


#ifdef CONFIG_PPC_WATCHDOG

INT_DEFINE_BEGIN(soft_nmi)
	IVEC=0x900
	ISTACK=0
	IRECONCILE=0	/* Soft-NMI may fire under local_irq_disable */
INT_DEFINE_END(soft_nmi)

/*
 * Branch to soft_nmi_interrupt using the emergency stack. The emergency
 * stack is one that is usable by maskable interrupts so long as MSR_EE
 * remains off. It is used for recovery when something has corrupted the
 * normal kernel stack, for example. The "soft NMI" must not use the process
 * stack because we want irq disabled sections to avoid touching the stack
 * at all (other than PMU interrupts), so use the emergency stack for this,
 * and run it entirely with interrupts hard disabled.
 */
EXC_COMMON_BEGIN(soft_nmi_common)
	mfspr	r11,SPRN_SRR0
	mr	r10,r1
	ld	r1,PACAEMERGSP(r13)
	subi	r1,r1,INT_FRAME_SIZE
	__GEN_COMMON_BODY soft_nmi

	/*
	 * Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
	 * system_reset_common)
	 */
	li	r10,IRQS_ALL_DISABLED
	stb	r10,PACAIRQSOFTMASK(r13)
	lbz	r10,PACAIRQHAPPENED(r13)
	std	r10,RESULT(r1)
	ori	r10,r10,PACA_IRQ_HARD_DIS
	stb	r10,PACAIRQHAPPENED(r13)

	addi	r3,r1,STACK_FRAME_OVERHEAD
	bl	soft_nmi_interrupt

	/* Clear MSR_RI before setting SRR0 and SRR1. */
	li	r9,0
	mtmsrd	r9,1

	/*
	 * Restore soft mask settings.
	 */
	ld	r10,RESULT(r1)
	stb	r10,PACAIRQHAPPENED(r13)
	ld	r10,SOFTE(r1)
	stb	r10,PACAIRQSOFTMASK(r13)

	kuap_restore_amr r9, r10
	EXCEPTION_RESTORE_REGS hsrr=0
	RFI_TO_KERNEL

#endif /* CONFIG_PPC_WATCHDOG */

/*
 * An interrupt came in while soft-disabled. We set paca->irq_happened, then:
 * - If it was a decrementer interrupt, we bump the dec to max and return.
 * - If it was a doorbell we return immediately since doorbells are edge
 *   triggered and won't automatically refire.
 * - If it was a HMI we return immediately since we handled it in realmode
 *   and it won't refire.
 * - Else it is one of PACA_IRQ_MUST_HARD_MASK, so hard disable and return.
 * This is called with r10 containing the value to OR to the paca field.
 */
.macro MASKED_INTERRUPT hsrr=0
	.if \hsrr
masked_Hinterrupt:
	.else
masked_interrupt:
	.endif
	lbz	r11,PACAIRQHAPPENED(r13)
	or	r11,r11,r10
	stb	r11,PACAIRQHAPPENED(r13)
	cmpwi	r10,PACA_IRQ_DEC
	bne	1f
	lis	r10,0x7fff
	ori	r10,r10,0xffff
	mtspr	SPRN_DEC,r10
#ifdef CONFIG_PPC_WATCHDOG
	b	soft_nmi_common
#else
	b	2f
#endif
1:	andi.	r10,r10,PACA_IRQ_MUST_HARD_MASK
	beq	2f
	xori	r12,r12,MSR_EE	/* clear MSR_EE */
	.if \hsrr
	mtspr	SPRN_HSRR1,r12
	.else
	mtspr	SPRN_SRR1,r12
	.endif
	ori	r11,r11,PACA_IRQ_HARD_DIS
	stb	r11,PACAIRQHAPPENED(r13)
2:	/* done */
	ld	r10,PACA_EXGEN+EX_CTR(r13)
	mtctr	r10
	mtcrf	0x80,r9
	std	r1,PACAR1(r13)
	ld	r9,PACA_EXGEN+EX_R9(r13)
	ld	r10,PACA_EXGEN+EX_R10(r13)
	ld	r11,PACA_EXGEN+EX_R11(r13)
	ld	r12,PACA_EXGEN+EX_R12(r13)
	ld	r13,PACA_EXGEN+EX_R13(r13)
	/* May return to masked low address where r13 is not set up */
	.if \hsrr
	HRFI_TO_KERNEL
	.else
	RFI_TO_KERNEL
	.endif
	b	.
.endm
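
The decision made by MASKED_INTERRUPT can be modelled in C roughly as below. This is only a sketch for reading the macro, not kernel code: the struct, the function name and every constant value are placeholders invented for the example (the real bit values live in the kernel headers), and it follows the path taken without CONFIG_PPC_WATCHDOG, where the decrementer case simply returns.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for this sketch only, not copied from kernel headers. */
enum {
	IRQ_HARD_DIS = 0x01,
	IRQ_DBELL    = 0x02,
	IRQ_EE       = 0x04,
	IRQ_DEC      = 0x08,
	IRQ_HMI      = 0x20,
	IRQ_MUST_HARD_MASK = IRQ_EE,	/* assumed membership of the mask */
};
#define SKETCH_MSR_EE	0x8000ull	/* placeholder for the MSR_EE bit */
#define SKETCH_DEC_MAX	0x7fffffffu	/* value built by lis/ori in the macro */

/* Hypothetical stand-in for the state the macro touches. */
struct masked_state {
	uint8_t  irq_happened;	/* models paca->irq_happened */
	uint64_t saved_msr;	/* models SRR1 or HSRR1 (held in r12) */
	uint32_t dec;		/* models the decrementer SPR */
};

/* 'reason' plays the role of r10: the bit to OR into irq_happened. */
void masked_interrupt_model(struct masked_state *s, uint8_t reason)
{
	assert(s != NULL);

	/* Always record that the interrupt happened. */
	s->irq_happened |= reason;

	if (reason == IRQ_DEC) {
		/* Decrementer: push the next expiry as far out as possible. */
		s->dec = SKETCH_DEC_MAX;
		return;
	}

	if (reason & IRQ_MUST_HARD_MASK) {
		/*
		 * Level-triggered source: return with MSR_EE clear so it
		 * cannot refire, and note that we are now hard disabled.
		 * (The asm uses xori, which clears the bit because it was
		 * set when the interrupt was taken.)
		 */
		s->saved_msr &= ~SKETCH_MSR_EE;
		s->irq_happened |= IRQ_HARD_DIS;
	}
	/* Doorbell and HMI fall through: nothing more to do. */
}
```

Calling masked_interrupt_model() with IRQ_DBELL or IRQ_HMI only sets the corresponding bit, which matches the "return immediately" cases in the comment above the macro.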

TRAMP_REAL_BEGIN(stf_barrier_fallback)
	std	r9,PACA_EXRFI+EX_R9(r13)
	std	r10,PACA_EXRFI+EX_R10(r13)
	sync
	ld	r9,PACA_EXRFI+EX_R9(r13)
	ld	r10,PACA_EXRFI+EX_R10(r13)
	ori	31,31,0
	.rept 14
	b	1f
1:
	.endr
	blr
powerpc/64s: Add support for RFI flush of L1-D cache
On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.
This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.
The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.
In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.
In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.
For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.
In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, back ports testing etc.
Many thanks to all of them.
Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09 19:07:15 +03:00
TRAMP_REAL_BEGIN(rfi_flush_fallback)
	SET_SCRATCH0(r13);
	GET_PACA(r13);
powerpc/64s: Make rfi_flush_fallback a little more robust
Because rfi_flush_fallback runs immediately before the return to
userspace it currently runs with the user r1 (stack pointer). This
means if we oops in there we will report a bad kernel stack pointer in
the exception entry path, eg:
Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4
Oops: Bad kernel stack pointer, sig: 6 [#1]
LE SMP NR_CPUS=32 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7
NIP: c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040
REGS: c0000000fffe7d40 TRAP: 4100 Not tainted (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3)
MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000
CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80
GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020
...
NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80
LR [0000000010053e00] 0x10053e00
Although the NIP tells us where we were, and the TRAP number tells us
what happened, it would still be nicer if we could report the actual
exception rather than barfing about the stack pointer.
We can do that fairly simply by loading the kernel stack pointer on
entry and restoring the user value before returning. That way we see a
regular oops such as:
Unrecoverable exception 4100 at c00000000000239c
Oops: Unrecoverable exception, sig: 6 [#1]
LE SMP NR_CPUS=32 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40
NIP: c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040
REGS: c0000000f1e17bb0 TRAP: 4100 Not tainted (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty)
MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000
CFAR: c00000000000bac8 IRQMASK: 0
...
NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80
LR [0000000010053e00] 0x10053e00
Call Trace:
[c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable)
Note this shouldn't make the kernel stack pointer vulnerable to a
meltdown attack, because it should be flushed from the cache before we
return to userspace. The user r1 value will be in the cache, because
we load it in the return path, but that is harmless.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-07-26 15:42:44 +03:00
	std	r1,PACA_EXRFI+EX_R12(r13)
	ld	r1,PACAKSAVE(r13)
	std	r9,PACA_EXRFI+EX_R9(r13)
	std	r10,PACA_EXRFI+EX_R10(r13)
	std	r11,PACA_EXRFI+EX_R11(r13)
	mfctr	r9
	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
	ld	r11,PACA_L1D_FLUSH_SIZE(r13)
	srdi	r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
	mtctr	r11
	DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */

	/* order ld/st prior to dcbt stop all streams with flushing */
	sync

	/*
	 * The load addresses are at staggered offsets within cachelines,
	 * which suits some pipelines better (on others it should not
	 * hurt).
	 */
1:
	ld	r11,(0x80 + 8)*0(r10)
	ld	r11,(0x80 + 8)*1(r10)
	ld	r11,(0x80 + 8)*2(r10)
	ld	r11,(0x80 + 8)*3(r10)
	ld	r11,(0x80 + 8)*4(r10)
	ld	r11,(0x80 + 8)*5(r10)
	ld	r11,(0x80 + 8)*6(r10)
	ld	r11,(0x80 + 8)*7(r10)
	addi	r10,r10,0x80*8
	bdnz	1b

	mtctr	r9
	ld	r9,PACA_EXRFI+EX_R9(r13)
	ld	r10,PACA_EXRFI+EX_R10(r13)
	ld	r11,PACA_EXRFI+EX_R11(r13)
	ld	r1,PACA_EXRFI+EX_R12(r13)
	GET_SCRATCH0(r13);
	rfid
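
The arithmetic of the displacement-flush loop in rfi_flush_fallback can be checked with a small C model. It is a sketch under stated assumptions: the function names are invented for illustration, and the 64 KiB size used in the note below is an example value, not something the source specifies. Only the shift by (7 + 3), the 0x80 + 8 stagger and the 0x80 * 8 stride are taken from the assembly.

```c
#include <assert.h>
#include <stdint.h>

#define LINE_SHIFT	7			/* 128-byte cache lines */
#define UNROLL_SHIFT	3			/* eight loads per loop iteration */
#define LINE_SIZE	(1u << LINE_SHIFT)	/* 0x80 */

/* Loop trip count: mirrors "srdi r11,r11,(7 + 3)". */
uint64_t flush_iterations(uint64_t l1d_flush_size)
{
	return l1d_flush_size >> (LINE_SHIFT + UNROLL_SHIFT);
}

/*
 * Byte offset, from the start of the fallback area, of load k (0..7) in
 * iteration i. Each load is staggered 8 bytes further into its line, and
 * the base advances by eight lines per iteration.
 */
uint64_t flush_load_offset(uint64_t i, unsigned int k)
{
	assert(k < 8);
	return i * ((uint64_t)LINE_SIZE * 8) + (uint64_t)(LINE_SIZE + 8) * k;
}

/* Count the distinct cache lines touched by the whole loop. */
uint64_t flush_lines_touched(uint64_t l1d_flush_size)
{
	uint64_t iterations = flush_iterations(l1d_flush_size);
	uint64_t lines = 0;
	uint64_t prev = UINT64_MAX;	/* no line seen yet */

	for (uint64_t i = 0; i < iterations; i++) {
		for (unsigned int k = 0; k < 8; k++) {
			/* Lines are visited in increasing order. */
			uint64_t line = flush_load_offset(i, k) >> LINE_SHIFT;

			if (line != prev)
				lines++;
			prev = line;
		}
	}
	return lines;
}
```

With an example flush size of 64 KiB the loop runs 64 times and touches 512 lines, one load per 128-byte line, so the stagger changes where in each line the load lands without skipping or repeating any line.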

TRAMP_REAL_BEGIN(hrfi_flush_fallback)
	SET_SCRATCH0(r13);
	GET_PACA(r13);
	std	r1,PACA_EXRFI+EX_R12(r13)
	ld	r1,PACAKSAVE(r13)
powerpc/64s: Add support for RFI flush of L1-D cache
On some CPUs we can prevent the Meltdown vulnerability by flushing the
L1-D cache on exit from kernel to user mode, and from hypervisor to
guest.
This is known to be the case on at least Power7, Power8 and Power9. At
this time we do not know the status of the vulnerability on other CPUs
such as the 970 (Apple G5), pasemi CPUs (AmigaOne X1000) or Freescale
CPUs. As more information comes to light we can enable this, or other
mechanisms on those CPUs.
The vulnerability occurs when the load of an architecturally
inaccessible memory region (eg. userspace load of kernel memory) is
speculatively executed to the point where its result can influence the
address of a subsequent speculatively executed load.
In order for that to happen, the first load must hit in the L1,
because before the load is sent to the L2 the permission check is
performed. Therefore if no kernel addresses hit in the L1 the
vulnerability can not occur. We can ensure that is the case by
flushing the L1 whenever we return to userspace. Similarly for
hypervisor vs guest.
In order to flush the L1-D cache on exit, we add a section of nops at
each (h)rfi location that returns to a lower privileged context, and
patch that with some sequence. Newer firmwares are able to advertise
to us that there is a special nop instruction that flushes the L1-D.
If we do not see that advertised, we fall back to doing a displacement
flush in software.
For guest kernels we support migration between some CPU versions, and
different CPUs may use different flush instructions. So that we are
prepared to migrate to a machine with a different flush instruction
activated, we may have to patch more than one flush instruction at
boot if the hypervisor tells us to.
In the end this patch is mostly the work of Nicholas Piggin and
Michael Ellerman. However a cast of thousands contributed to analysis
of the issue, earlier versions of the patch, backports, testing etc.
Many thanks to all of them.
Tested-by: Jon Masters <jcm@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2018-01-09 19:07:15 +03:00
std r9,PACA_EXRFI+EX_R9(r13)
std r10,PACA_EXRFI+EX_R10(r13)
std r11,PACA_EXRFI+EX_R11(r13)
mfctr r9
ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
2018-01-17 16:58:18 +03:00
ld r11,PACA_L1D_FLUSH_SIZE(r13)
srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
2018-01-09 19:07:15 +03:00
mtctr r11
2018-02-20 22:08:26 +03:00
DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
2018-01-09 19:07:15 +03:00
/* order ld/st prior to dcbt stop all streams with flushing */
sync
2018-01-17 16:58:18 +03:00
/*
 * The load addresses are at staggered offsets within cachelines,
 * which suits some pipelines better (on others it should not
 * hurt).
 */
1:
ld r11,(0x80 + 8)*0(r10)
ld r11,(0x80 + 8)*1(r10)
ld r11,(0x80 + 8)*2(r10)
ld r11,(0x80 + 8)*3(r10)
ld r11,(0x80 + 8)*4(r10)
ld r11,(0x80 + 8)*5(r10)
ld r11,(0x80 + 8)*6(r10)
ld r11,(0x80 + 8)*7(r10)
addi r10,r10,0x80*8
2018-01-09 19:07:15 +03:00
bdnz 1b
mtctr r9
ld r9,PACA_EXRFI+EX_R9(r13)
ld r10,PACA_EXRFI+EX_R10(r13)
ld r11,PACA_EXRFI+EX_R11(r13)
powerpc/64s: Make rfi_flush_fallback a little more robust
Because rfi_flush_fallback runs immediately before the return to
userspace it currently runs with the user r1 (stack pointer). This
means if we oops in there we will report a bad kernel stack pointer in
the exception entry path, eg:
Bad kernel stack pointer 7ffff7150e40 at c0000000000023b4
Oops: Bad kernel stack pointer, sig: 6 [#1]
LE SMP NR_CPUS=32 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1246 Comm: klogd Not tainted 4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3 #7
NIP: c0000000000023b4 LR: 0000000010053e00 CTR: 0000000000000040
REGS: c0000000fffe7d40 TRAP: 4100 Not tainted (4.18.0-rc2-gcc-7.3.1-00175-g0443f8a69ba3)
MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000
CFAR: c00000000000bac8 IRQMASK: c0000000f1e66a80
GPR00: 0000000002000000 00007ffff7150e40 00007fff93a99900 0000000000000020
...
NIP [c0000000000023b4] rfi_flush_fallback+0x34/0x80
LR [0000000010053e00] 0x10053e00
Although the NIP tells us where we were, and the TRAP number tells us
what happened, it would still be nicer if we could report the actual
exception rather than barfing about the stack pointer.
We can do that fairly simply by loading the kernel stack pointer on
entry and restoring the user value before returning. That way we see a
regular oops such as:
Unrecoverable exception 4100 at c00000000000239c
Oops: Unrecoverable exception, sig: 6 [#1]
LE SMP NR_CPUS=32 NUMA PowerNV
Modules linked in:
CPU: 0 PID: 1251 Comm: klogd Not tainted 4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty #40
NIP: c00000000000239c LR: 0000000010053e00 CTR: 0000000000000040
REGS: c0000000f1e17bb0 TRAP: 4100 Not tainted (4.18.0-rc3-gcc-7.3.1-00097-g4ebfcac65acd-dirty)
MSR: 9000000002803031 <SF,HV,VEC,VSX,FP,ME,IR,DR,LE> CR: 44000442 XER: 20000000
CFAR: c00000000000bac8 IRQMASK: 0
...
NIP [c00000000000239c] rfi_flush_fallback+0x3c/0x80
LR [0000000010053e00] 0x10053e00
Call Trace:
[c0000000f1e17e30] [c00000000000b9e4] system_call+0x5c/0x70 (unreliable)
Note this shouldn't make the kernel stack pointer vulnerable to a
meltdown attack, because it should be flushed from the cache before we
return to userspace. The user r1 value will be in the cache, because
we load it in the return path, but that is harmless.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Nicholas Piggin <npiggin@gmail.com>
2018-07-26 15:42:44 +03:00
ld r1,PACA_EXRFI+EX_R12(r13)
2018-01-09 19:07:15 +03:00
GET_SCRATCH0(r13);
hrfid
2020-06-11 11:12:03 +03:00
TRAMP_REAL_BEGIN(rfscv_flush_fallback)
/* system call volatile */
mr r7,r13
GET_PACA(r13);
mr r8,r1
ld r1,PACAKSAVE(r13)
mfctr r9
ld r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
ld r11,PACA_L1D_FLUSH_SIZE(r13)
srdi r11,r11,(7 + 3) /* 128 byte lines, unrolled 8x */
mtctr r11
DCBT_BOOK3S_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
/* order ld/st prior to dcbt stop all streams with flushing */
sync
/*
 * The load addresses are at staggered offsets within cachelines,
 * which suits some pipelines better (on others it should not
 * hurt).
 */
1:
ld r11,(0x80 + 8)*0(r10)
ld r11,(0x80 + 8)*1(r10)
ld r11,(0x80 + 8)*2(r10)
ld r11,(0x80 + 8)*3(r10)
ld r11,(0x80 + 8)*4(r10)
ld r11,(0x80 + 8)*5(r10)
ld r11,(0x80 + 8)*6(r10)
ld r11,(0x80 + 8)*7(r10)
addi r10,r10,0x80*8
bdnz 1b
mtctr r9
li r9,0
li r10,0
li r11,0
mr r1,r8
mr r13,r7
RFSCV
2020-02-25 20:35:20 +03:00
USE_TEXT_SECTION()
2020-02-25 20:35:27 +03:00
MASKED_INTERRUPT
MASKED_INTERRUPT hsrr=1
2009-06-03 01:17:38 +04:00
2013-09-20 08:52:50 +04:00
#ifdef CONFIG_KVM_BOOK3S_64_HANDLER
2020-02-25 20:35:21 +03:00
kvmppc_skip_interrupt:
2013-09-20 08:52:50 +04:00
/*
 * Here all GPRs are unchanged from when the interrupt happened
 * except for r13, which is saved in SPRG_SCRATCH0.
 */
mfspr r13, SPRN_SRR0
addi r13, r13, 4
mtspr SPRN_SRR0, r13
GET_SCRATCH0(r13)
2018-01-09 19:07:15 +03:00
RFI_TO_KERNEL
2013-09-20 08:52:50 +04:00
b .
2020-02-25 20:35:21 +03:00
kvmppc_skip_Hinterrupt:
2013-09-20 08:52:50 +04:00
/*
 * Here all GPRs are unchanged from when the interrupt happened
 * except for r13, which is saved in SPRG_SCRATCH0.
 */
mfspr r13, SPRN_HSRR0
addi r13, r13, 4
mtspr SPRN_HSRR0, r13
GET_SCRATCH0(r13)
2018-01-09 19:07:15 +03:00
HRFI_TO_KERNEL
2013-09-20 08:52:50 +04:00
b .
#endif
2012-11-02 10:21:43 +04:00
/*
 * Relocation-on interrupts: A subset of the interrupts can be delivered
 * with IR=1/DR=1, if AIL==2 and MSR.HV won't be changed by delivering
 * it. Addresses are the same as the original interrupt addresses, but
 * offset by 0xc000000000004000.
 * It's impossible to receive interrupts below 0x300 via this mechanism.
 * KVM: None of these traps are from the guest; anything that escalated
 * to HV=1 from HV=0 is delivered via real mode handlers.
 */
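The address rule stated in the relocation-on comment can be checked with a few lines of arithmetic. This is an illustrative Python sketch only; `relon_vector_address` is a hypothetical helper, not a kernel function, and it simply applies the offset and the 0x300 lower bound given in the comment.

```python
RELON_OFFSET = 0xC000000000004000  # offset quoted in the comment
LOWEST_RELON_VECTOR = 0x300        # vectors below this are never relocation-on


def relon_vector_address(real_vector):
    """Map a real-mode vector offset to its relocation-on address."""
    if real_vector < LOWEST_RELON_VECTOR:
        raise ValueError("no relocation-on delivery below 0x300")
    return RELON_OFFSET + real_vector
```

For example, the 0x300 vector maps to 0xc000000000004300 and the 0xf00 vector to 0xc000000000004f00.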
/*
 * This uses the standard macro, since the original 0x300 vector
 * only has extra guff for STAB-based processors -- which never
 * come here.
 */
2016-09-30 12:43:18 +03:00
2016-09-28 04:31:48 +03:00
EXC_COMMON_BEGIN(ppc64_runlatch_on_trampoline)
2014-02-04 09:04:35 +04:00
b __ppc64_runlatch_on
2012-03-01 05:45:27 +04:00
2016-09-28 04:31:48 +03:00
USE_FIXED_SECTION(virt_trampolines)
powerpc/book3s64: Fix branching to OOL handlers in relocatable kernel
Some of the interrupt vectors on 64-bit POWER server processors are only
32 bytes long (8 instructions), which is not enough for the full
first-level interrupt handler. For these we need to branch to an
out-of-line (OOL) handler. But when we are running a relocatable kernel,
interrupt vectors till __end_interrupts marker are copied down to real
address 0x100. So, branching to labels (ie. OOL handlers) outside this
section must be handled differently (see LOAD_HANDLER()), considering
relocatable kernel, which would need at least 4 instructions.
However, branching from interrupt vector means that we corrupt the
CFAR (come-from address register) on POWER7 and later processors as
mentioned in commit 1707dd16. So, EXCEPTION_PROLOG_0 (6 instructions)
that contains the part up to the point where the CFAR is saved in the
PACA should be part of the short interrupt vectors before we branch out
to OOL handlers.
But as mentioned already, there are interrupt vectors on 64-bit POWER
server processors that are only 32 bytes long (like vectors 0x4f00,
0x4f20, etc.), which cannot accommodate the above two cases at the same
time owing to space constraint. Currently, in these interrupt vectors,
we simply branch out to OOL handlers, without using LOAD_HANDLER(),
which leaves us vulnerable when running a relocatable kernel (eg. kdump
case). While this has been the case for some time now and kdump is used
widely, we were fortunate not to see any problems so far, for three
reasons:
1. In almost all cases, production kernel (relocatable) is used for
kdump as well, which would mean that crashed kernel's OOL handler
would be at the same place where we end up branching to, from short
interrupt vector of kdump kernel.
2. Also, OOL handler was unlikely the reason for crash in almost all
the kdump scenarios, which meant we had a sane OOL handler from
crashed kernel that we branched to.
3. On most 64-bit POWER server processors, page size is large enough
that marking interrupt vector code as executable (see commit
429d2e83) leads to marking OOL handler code from crashed kernel,
that sits right below interrupt vector code from kdump kernel, as
executable as well.
Let us fix this by moving the __end_interrupts marker down past OOL
handlers to make sure that we also copy OOL handlers to real address
0x100 when running a relocatable kernel.
This fix has been tested successfully in kdump scenario, on an LPAR with
4K page size by using different default/production kernel and kdump
kernel.
Also tested by manually corrupting the OOL handlers in the first kernel
and then kdump'ing, and then causing the OOL handlers to fire - mpe.
Fixes: c1fb6816fb1b ("powerpc: Add relocation on exception vector handlers")
Cc: stable@vger.kernel.org
Signed-off-by: Hari Bathini <hbathini@linux.vnet.ibm.com>
Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
2016-04-15 15:48:02 +03:00
/*
2020-06-11 11:12:02 +03:00
 * All code below __end_interrupts is treated as soft-masked. If
 * any code runs here with MSR[EE]=1, it must then cope with pending
 * soft interrupt being raised (i.e., by ensuring it is replayed).
 *
2016-04-15 15:48:02 +03:00
 * The __end_interrupts marker must be past the out-of-line (OOL)
 * handlers, so that they are copied to real address 0x100 when running
 * a relocatable kernel. This ensures they can be reached from the short
 * trampoline handlers (like 0x4f00, 0x4f20, etc.) which branch
 * directly, without using LOAD_HANDLER().
 */
.align 7
.globl __end_interrupts
__end_interrupts:
2016-09-28 04:31:48 +03:00
DEFINE_FIXED_SYMBOL(__end_interrupts)
2013-01-10 10:44:19 +04:00
2013-03-25 05:31:31 +04:00
#ifdef CONFIG_PPC_970_NAP
2019-07-11 05:24:03 +03:00
/*
 * Called by exception entry code if _TLF_NAPPING was set, this clears
 * the NAPPING flag, and redirects the exception exit to
 * power4_fixup_nap_return.
 */
.globl power4_fixup_nap
2016-10-11 10:47:56 +03:00
EXC_COMMON_BEGIN(power4_fixup_nap)
2013-03-25 05:31:31 +04:00
andc r9,r9,r10
std r9,TI_LOCAL_FLAGS(r11)
2019-07-11 05:24:03 +03:00
LOAD_REG_ADDR(r10, power4_idle_nap_return)
std r10,_NIP(r1)
blr
power4_idle_nap_return:
2013-03-25 05:31:31 +04:00
blr
#endif
2016-09-28 04:31:48 +03:00
CLOSE_FIXED_SECTION(real_vectors);
CLOSE_FIXED_SECTION(real_trampolines);
CLOSE_FIXED_SECTION(virt_vectors);
CLOSE_FIXED_SECTION(virt_trampolines);
USE_TEXT_SECTION()
2019-08-02 13:56:38 +03:00
/* MSR[RI] should be clear because this uses SRR[01] */
enable_machine_check:
mflr r0
bcl 20,31,$+4
0: mflr r3
addi r3,r3,(1f - 0b)
mtspr SPRN_SRR0,r3
mfmsr r3
ori r3,r3,MSR_ME
mtspr SPRN_SRR1,r3
RFI_TO_KERNEL
1: mtlr r0
blr
2019-08-02 13:56:39 +03:00
/* MSR[RI] should be clear because this uses SRR[01] */
disable_machine_check:
mflr r0
bcl 20,31,$+4
0: mflr r3
addi r3,r3,(1f - 0b)
mtspr SPRN_SRR0,r3
mfmsr r3
li r4,MSR_ME
andc r3,r3,r4
mtspr SPRN_SRR1,r3
RFI_TO_KERNEL
1: mtlr r0
blr
2009-06-03 01:17:38 +04:00
/*
 * Hash table stuff
 */
2016-10-13 06:43:52 +03:00
.balign IFETCH_ALIGN_BYTES
2014-02-04 09:06:11 +04:00
do_hash_page:
2017-10-19 07:08:43 +03:00
#ifdef CONFIG_PPC_BOOK3S_64
2018-01-19 04:50:40 +03:00
lis r0,(DSISR_BAD_FAULT_64S | DSISR_DABRMATCH | DSISR_KEYFAULT)@h
2017-07-19 07:49:27 +03:00
ori r0,r0,DSISR_BAD_FAULT_64S@l
2019-08-02 13:57:01 +03:00
and. r0,r5,r0 /* weird error? */
2009-06-03 01:17:38 +04:00
bne- handle_page_fault /* if not, try to insert a HPTE */
2020-07-27 09:09:47 +03:00
/*
 * If we are in an "NMI" (e.g., an interrupt when soft-disabled), then
 * don't call hash_page, just fail the fault. This is required to
 * prevent re-entrancy problems in the hash code, namely perf
 * interrupts hitting while something holds H_PAGE_BUSY, and taking a
 * hash fault. See the comment in hash_preload().
 */
2019-01-12 12:55:50 +03:00
ld r11, PACA_THREAD_INFO(r13)
2020-07-27 09:09:47 +03:00
lwz r0,TI_PREEMPT(r11)
andis. r0,r0,NMI_MASK@h
bne 77f
2009-06-03 01:17:38 +04:00
/*
2019-08-02 13:57:01 +03:00
 * r3 contains the trap number
 * r4 contains the faulting address
 * r5 contains dsisr
 * r6 msr
2009-06-03 01:17:38 +04:00
 *
powerpc: Rework lazy-interrupt handling
The current implementation of lazy interrupts handling has some
issues that this tries to address.
We don't do the various workarounds we need to do when re-enabling
interrupts in some cases such as when returning from an interrupt
and thus we may still lose or get delayed decrementer or doorbell
interrupts.
The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.
Additionally, we tend to keep interrupts hard disabled in a number
of cases, such as decrementer interrupts, external interrupts, or
when a masked decrementer interrupt is pending. This is sub-optimal.
This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling from the ground up.
The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.
When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field.
We then implement replaying of the missed interrupts either by
re-using the existing exception frame (in exception exit case) or via
the creation of a new one from an assembly trampoline (in the
arch_local_irq_enable case).
This removes the need to play with the decrementer to try to create
fake interrupts, among others.
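The bookkeeping described above can be illustrated with a small model. This is a simplified Python sketch under stated assumptions, not the kernel implementation: the bit values, the class name `SoftMaskModel`, and the fixed replay order are all hypothetical; only the idea of recording interrupts in an "irq_happened" bit mask while soft-disabled and replaying them on restore comes from the description.

```python
# Hypothetical bit assignments, for illustration only.
IRQ_DECREMENTER = 0x1
IRQ_DOORBELL = 0x2
IRQ_EXTERNAL = 0x4


class SoftMaskModel:
    """Toy model of lazy (soft) interrupt disabling with replay."""

    def __init__(self):
        self.soft_enabled = True
        self.irq_happened = 0   # bit mask of interrupts seen while masked
        self.handled = []       # interrupts actually delivered, in order

    def local_irq_disable(self):
        # Soft-disable only: nothing is hard-masked in this model.
        self.soft_enabled = False

    def raise_irq(self, bit):
        if self.soft_enabled:
            self.handled.append(bit)
        else:
            # Record the interrupt instead of handling it now.
            self.irq_happened |= bit

    def local_irq_restore(self):
        self.soft_enabled = True
        # Replay whatever was recorded, testing one bit at a time.
        for bit in (IRQ_DECREMENTER, IRQ_DOORBELL, IRQ_EXTERNAL):
            if self.irq_happened & bit:
                self.irq_happened &= ~bit
                self.handled.append(bit)
```

Replaying in a fixed bit order is a choice made for this sketch; the description above does not specify an order.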
In addition, this adds a few refinements:
- We no longer hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.
- Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.
- On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appear to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter). (This also fixes a bug where BookE
perfmon interrupts could clobber r14 ... oops)
- We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.
Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2:
- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
to retrigger an interrupt without preventing hard-enable
v3:
- Fix or vs. ori bug on Book3E
- Fix enabling of interrupts for some exceptions on Book3E
v4:
- Fix resend of doorbells on return from interrupt on Book3E
v5:
- Rebased on top of my latest series, which involves some significant
rework of some aspects of the patch.
v6:
- 32-bit compile fix
- more compile fixes with various .config combos
- factor out the asm code to soft-disable interrupts
- remove the C wrapper around preempt_schedule_irq
v7:
- Fix a bug with hard irq state tracking on native power7
2012-03-06 11:27:59 +04:00
 * at return r3 = 0 for success, 1 for page fault, negative for error
2009-06-03 01:17:38 +04:00
 */
2015-12-01 06:36:44 +03:00
bl __hash_page /* build HPTE if possible */
cmpdi r3,0 /* see if __hash_page succeeded */
2009-06-03 01:17:38 +04:00
powerpc: Rework lazy-interrupt handling
The current implementation of lazy interrupts handling has some
issues that this tries to address.
We don't do the various workarounds we need to do when re-enabling
interrupts in some cases such as when returning from an interrupt
and thus we may still lose or get delayed decrementer or doorbell
interrupts.
The current scheme also makes it much harder to handle the external
"edge" interrupts provided by some BookE processors when using the
EPR facility (External Proxy) and the Freescale Hypervisor.
Additionally, we tend to keep interrupts hard disabled in a number
of cases, such as decrementer interrupts, external interrupts, or
when a masked decrementer interrupt is pending. This is sub-optimal.
This is an attempt at fixing it all in one go by reworking the way
we do the lazy interrupt disabling from the ground up.
The base idea is to replace the "hard_enabled" field with a
"irq_happened" field in which we store a bit mask of what interrupt
occurred while soft-disabled.
When re-enabling, either via arch_local_irq_restore() or when returning
from an interrupt, we can now decide what to do by testing bits in that
field.
We then implement replaying of the missed interrupts either by
re-using the existing exception frame (in exception exit case) or via
the creation of a new one from an assembly trampoline (in the
arch_local_irq_enable case).
This removes the need to play with the decrementer to try to create
fake interrupts, among others.
In addition, this adds a few refinements:
- We no longer hard disable decrementer interrupts that occur
while soft-disabled. We now simply bump the decrementer back to max
(on BookS) or leave it stopped (on BookE) and continue with hard interrupts
enabled, which means that we'll potentially get better sample quality from
performance monitor interrupts.
- Timer, decrementer and doorbell interrupts now hard-enable
shortly after removing the source of the interrupt, which means
they no longer run entirely hard disabled. Again, this will improve
perf sample quality.
- On Book3E 64-bit, we now make the performance monitor interrupt
act as an NMI like Book3S (the necessary C code for that to work
appears to already be present in the FSL perf code, notably calling
nmi_enter instead of irq_enter). (This also fixes a bug where BookE
perfmon interrupts could clobber r14 ... oops)
- We could make "masked" decrementer interrupts act as NMIs when doing
timer-based perf sampling to improve the sample quality.
Signed-off-by-yet: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
v2:
- Add hard-enable to decrementer, timer and doorbells
- Fix CR clobber in masked irq handling on BookE
- Make embedded perf interrupt act as an NMI
- Add a PACA_HAPPENED_EE_EDGE for use by FSL if they want
to retrigger an interrupt without preventing hard-enable
v3:
- Fix or vs. ori bug on Book3E
- Fix enabling of interrupts for some exceptions on Book3E
v4:
- Fix resend of doorbells on return from interrupt on Book3E
v5:
- Rebased on top of my latest series, which involves some significant
rework of some aspects of the patch.
v6:
- 32-bit compile fix
- more compile fixes with various .config combos
- factor out the asm code to soft-disable interrupts
- remove the C wrapper around preempt_schedule_irq
v7:
- Fix a bug with hard irq state tracking on native power7
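The core idea in the message above (record interrupts that arrive while soft-disabled in a bit mask, then replay them on re-enable) can be modelled in a few lines. This is a toy sketch in Python, not the kernel's implementation: the bit names and values, the `LazyIrqCpu` class, its methods, and the replay order are all assumptions made for illustration.

```python
# Toy model of lazy interrupt masking with an "irq_happened" bit mask.
# Bit names/values and the replay order are illustrative assumptions.
HAPPENED_EE = 0x01     # external interrupt seen while soft-disabled
HAPPENED_DEC = 0x02    # decrementer interrupt seen while soft-disabled
HAPPENED_DBELL = 0x04  # doorbell interrupt seen while soft-disabled


class LazyIrqCpu:
    def __init__(self):
        self.soft_enabled = True
        self.irq_happened = 0
        self.handled = []  # log of interrupts whose handlers actually ran

    def soft_disable(self):
        # Cheap: only a flag changes, nothing is masked in "hardware".
        self.soft_enabled = False

    def interrupt(self, bit):
        """An interrupt arrives: run it now, or record it for later."""
        if self.soft_enabled:
            self.handled.append(bit)
        else:
            self.irq_happened |= bit

    def soft_enable(self):
        """Re-enable, then replay every interrupt that was recorded."""
        self.soft_enabled = True
        for bit in (HAPPENED_EE, HAPPENED_DEC, HAPPENED_DBELL):
            if self.irq_happened & bit:
                self.irq_happened &= ~bit
                self.handled.append(bit)
```

Because the mask holds one bit per source, two decrementer interrupts arriving while soft-disabled collapse into a single replay in this model.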
2012-03-06 11:27:59 +04:00
/* Success */
2020-02-25 20:35:38 +03:00
beq	interrupt_return	/* Return from exception on success */
2009-06-03 01:17:38 +04:00
2012-03-06 11:27:59 +04:00
/* Error */
blt-	13f
2017-06-13 21:42:00 +03:00
2019-08-02 13:57:01 +03:00
/* Reload DAR/DSISR into r4/r5 for the DABR check below */
ld	r4,_DAR(r1)
ld	r5,_DSISR(r1)
2017-10-19 07:08:43 +03:00
#endif /* CONFIG_PPC_BOOK3S_64 */
2010-03-30 03:59:25 +04:00
2009-06-03 01:17:38 +04:00
/* Here we have a page fault that hash_page can't handle. */
handle_page_fault:
2019-08-02 13:57:01 +03:00
11:	andis.	r0,r5,DSISR_DABRMATCH@h
2017-06-13 21:42:00 +03:00
bne-	handle_dabr_fault
2009-06-03 01:17:38 +04:00
addi	r3,r1,STACK_FRAME_OVERHEAD
2014-02-04 09:04:35 +04:00
bl	do_page_fault
2009-06-03 01:17:38 +04:00
cmpdi	r3,0
2020-02-25 20:35:38 +03:00
beq+	interrupt_return
2009-06-03 01:17:38 +04:00
mr	r5,r3
addi	r3,r1,STACK_FRAME_OVERHEAD
2019-08-02 13:56:42 +03:00
ld	r4,_DAR(r1)
2014-02-04 09:04:35 +04:00
bl	bad_page_fault
powerpc/64s: Implement interrupt exit logic in C
Implement the bulk of interrupt return logic in C. The asm return code
must handle a few cases: restoring full GPRs, and emulating stack
store.
The stack store emulation is significantly simplified: rather than
creating a new return frame and switching to that before performing
the store, it uses the PACA to keep a scratch register around to
perform the store.
The asm return code is moved into 64e for now. The new logic has made
allowance for 64e, but I don't have a full environment that works well
to test it, and even booting in emulated qemu is not great for stress
testing. 64e shouldn't be too far off working with this, given a bit
more testing and auditing of the logic.
This is slightly faster on a POWER9 (page fault speed increases about
1.1%), probably due to reduced mtmsrd.
mpe: Includes fixes from Nick for _TIF_EMULATE_STACK_STORE
handling (including the fast_interrupt_return path), to remove
trace_hardirqs_on(), and fixes the interrupt-return part of the
MSR_VSX restore bug caught by tm-unavailable selftest.
mpe: Incorporate fix from Nick:
The return-to-kernel path has to replay any soft-pending interrupts if
it is returning to a context that had interrupts soft-enabled. It has
to do this carefully and avoid plain enabling interrupts if this is an
irq context, which can cause multiple nesting of interrupts on the
stack, and other unexpected issues.
The code which avoided this case got the soft-mask state wrong, and
marked interrupts as enabled before going around again to retry. This
seems to be mostly harmless, except that with PREEMPT=y it calls
preempt_schedule_irq with irqs apparently enabled and runs into a BUG
in kernel/sched/core.c.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200225173541.1549955-29-npiggin@gmail.com
2020-02-25 20:35:37 +03:00
b	interrupt_return
2009-06-03 01:17:38 +04:00
2012-03-07 09:48:45 +04:00
/* We have a data breakpoint exception - handle it */
handle_dabr_fault:
ld	r4,_DAR(r1)
ld	r5,_DSISR(r1)
addi	r3,r1,STACK_FRAME_OVERHEAD
2014-02-04 09:04:35 +04:00
bl	do_break
powerpc/watchpoint: Restore NV GPRs while returning from exception
powerpc hardware triggers a watchpoint before executing the instruction.
To provide trigger-after-execute behavior, the kernel emulates the
instruction. If the instruction is 'load something into non-volatile
register', the exception handler should restore the emulated register
state while returning, otherwise there will be register state
corruption. E.g., adding a watchpoint on a list can corrupt the list:
# cat /proc/kallsyms | grep kthread_create_list
c00000000121c8b8 d kthread_create_list
Add watchpoint on kthread_create_list->prev:
# perf record -e mem:0xc00000000121c8c0
Run some workload such that new kthread gets invoked. eg, I just
logged out from console:
list_add corruption. next->prev should be prev (c000000001214e00), \
but was c00000000121c8b8. (next=c00000000121c8b8).
WARNING: CPU: 59 PID: 309 at lib/list_debug.c:25 __list_add_valid+0xb4/0xc0
CPU: 59 PID: 309 Comm: kworker/59:0 Kdump: loaded Not tainted 5.1.0-rc7+ #69
...
NIP __list_add_valid+0xb4/0xc0
LR __list_add_valid+0xb0/0xc0
Call Trace:
__list_add_valid+0xb0/0xc0 (unreliable)
__kthread_create_on_node+0xe0/0x260
kthread_create_on_node+0x34/0x50
create_worker+0xe8/0x260
worker_thread+0x444/0x560
kthread+0x160/0x1a0
ret_from_kernel_thread+0x5c/0x70
List corruption happened because it uses 'load into non-volatile
register' instruction:
Snippet from __kthread_create_on_node:
c000000000136be8: addis r29,r2,-19
c000000000136bec: ld r29,31424(r29)
if (!__list_add_valid(new, prev, next))
c000000000136bf0: mr r3,r30
c000000000136bf4: mr r5,r28
c000000000136bf8: mr r4,r29
c000000000136bfc: bl c00000000059a2f8 <__list_add_valid+0x8>
Register state from WARN_ON():
GPR00: c00000000059a3a0 c000007ff23afb50 c000000001344e00 0000000000000075
GPR04: 0000000000000000 0000000000000000 0000001852af8bc1 0000000000000000
GPR08: 0000000000000001 0000000000000007 0000000000000006 00000000000004aa
GPR12: 0000000000000000 c000007ffffeb080 c000000000137038 c000005ff62aaa00
GPR16: 0000000000000000 0000000000000000 c000007fffbe7600 c000007fffbe7370
GPR20: c000007fffbe7320 c000007fffbe7300 c000000001373a00 0000000000000000
GPR24: fffffffffffffef7 c00000000012e320 c000007ff23afcb0 c000000000cb8628
GPR28: c00000000121c8b8 c000000001214e00 c000007fef5b17e8 c000007fef5b17c0
Watchpoint hit at 0xc000000000136bec.
addis r29,r2,-19
=> r29 = 0xc000000001344e00 + (-19 << 16)
=> r29 = 0xc000000001214e00
ld r29,31424(r29)
=> r29 = *(0xc000000001214e00 + 31424)
=> r29 = *(0xc00000000121c8c0)
0xc00000000121c8c0 is where we placed a watchpoint and thus this
instruction was emulated by emulate_step. But because handle_dabr_fault
did not restore emulated register state, r29 still contains stale
value in above register state.
Fixes: 5aae8a5370802 ("powerpc, hw_breakpoints: Implement hw_breakpoints for 64-bit server processors")
Signed-off-by: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: stable@vger.kernel.org # 2.6.36+
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
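The address arithmetic quoted in the message above can be checked mechanically. The following is a small sketch in Python that redoes the computation from the register dump; the helper names `addis` and `effective_address` are hypothetical and only model the 64-bit wrap-around of the two instructions, they are not kernel or emulator APIs.

```python
MASK64 = (1 << 64) - 1


def addis(ra, simm16):
    """Model addis: add a signed 16-bit immediate shifted left by 16 bits."""
    return (ra + (simm16 << 16)) & MASK64


def effective_address(base, disp):
    """Model the base + displacement address used by ld rT,disp(rA)."""
    return (base + disp) & MASK64


# Values taken from the commit message.
r2 = 0xC000000001344E00                   # GPR02 in the register dump
kthread_create_list = 0xC00000000121C8B8  # from /proc/kallsyms

r29_after_addis = addis(r2, -19)
watched = effective_address(r29_after_addis, 31424)

print(hex(r29_after_addis))  # value r29 still holds in the WARN_ON dump
print(hex(watched))          # address the watchpoint was placed on
```

The first value matches GPR29 in the dump, which is the stale pre-load value: the emulated `ld` result was never restored. The second value is `kthread_create_list` plus 8, the offset of the `prev` pointer in a two-pointer list head on a 64-bit kernel.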
2019-06-13 06:30:14 +03:00
/*
 * do_break() may have changed the NV GPRS while handling a breakpoint.
2020-02-25 20:35:38 +03:00
 * If so, we need to restore them with their updated values.
2019-06-13 06:30:14 +03:00
 */
2020-02-25 20:35:38 +03:00
REST_NVGPRS(r1)
2020-02-25 20:35:37 +03:00
b	interrupt_return
2012-03-07 09:48:45 +04:00
2009-06-03 01:17:38 +04:00
2017-10-19 07:08:43 +03:00
#ifdef CONFIG_PPC_BOOK3S_64
2009-06-03 01:17:38 +04:00
/* We have a page fault that hash_page could handle but HV refused
 * the PTE insertion
 */
2020-02-25 20:35:37 +03:00
13:	mr	r5,r3
2009-06-03 01:17:38 +04:00
addi	r3,r1,STACK_FRAME_OVERHEAD
ld	r4,_DAR(r1)
2014-02-04 09:04:35 +04:00
bl	low_hash_fault
2020-02-25 20:35:37 +03:00
b	interrupt_return
2016-04-29 16:26:07 +03:00
#endif
2009-06-03 01:17:38 +04:00
powerpc: Allow perf_counters to access user memory at interrupt time
This provides a mechanism to allow the perf_counters code to access
user memory in a PMU interrupt routine. Such an access can cause
various kinds of interrupt: SLB miss, MMU hash table miss, segment
table miss, or TLB miss, depending on the processor. This commit
only deals with 64-bit classic/server processors, which use an MMU
hash table. 32-bit processors are already able to access user memory
at interrupt time. Since we don't soft-disable on 32-bit, we avoid
the possibility of reentering hash_page or the TLB miss handlers,
since they run with interrupts disabled.
On 64-bit processors, an SLB miss interrupt on a user address will
update the slb_cache and slb_cache_ptr fields in the paca. This is
OK except in the case where a PMU interrupt occurs in switch_slb,
which also accesses those fields. To prevent this, we hard-disable
interrupts in switch_slb. Interrupts are already soft-disabled at
this point, and will get hard-enabled when they get soft-enabled
later.
This also reworks slb_flush_and_rebolt: to avoid hard-disabling twice,
and to make sure that it clears the slb_cache_ptr when called from
other callers than switch_slb, the existing routine is renamed to
__slb_flush_and_rebolt, which is called by switch_slb and the new
version of slb_flush_and_rebolt.
Similarly, switch_stab (used on POWER3 and RS64 processors) gets a
hard_irq_disable() to protect the per-cpu variables used there and
in ste_allocate.
If an MMU hashtable miss interrupt occurs, normally we would call
hash_page to look up the Linux PTE for the address and create a HPTE.
However, hash_page is fairly complex and takes some locks, so to
avoid the possibility of deadlock, we check the preemption count
to see if we are in a (pseudo-)NMI handler, and if so, we don't call
hash_page but instead treat it like a bad access that will get
reported up through the exception table mechanism. An interrupt
whose handler runs even though the interrupt occurred when
soft-disabled (such as the PMU interrupt) is considered a pseudo-NMI
handler, which should use nmi_enter()/nmi_exit() rather than
irq_enter()/irq_exit().
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Paul Mackerras <paulus@samba.org>
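The decision described in the last paragraph of the message above can be sketched as follows. This is a simplified Python model under stated assumptions, not the kernel's code: `handle_hash_miss`, its parameters, and the dictionary standing in for the exception table (faulting instruction address mapped to fixup address) are hypothetical, and `RuntimeError` stands in for a kernel panic.

```python
def handle_hash_miss(addr, pc, in_pseudo_nmi, exception_table, hash_page):
    """Model the dispatch for an MMU hash-table miss.

    addr            -- faulting data address
    pc              -- address of the faulting instruction
    in_pseudo_nmi   -- True when running inside a (pseudo-)NMI handler
    exception_table -- dict: faulting instruction address -> fixup address
    hash_page       -- callable that would build the HPTE (may take locks)
    """
    if not in_pseudo_nmi:
        # Normal path: safe to take locks, so try to create the HPTE.
        return ("hashed", hash_page(addr))

    # Pseudo-NMI path: never call hash_page, to avoid deadlocking on its
    # locks. Treat the access as bad and look for a registered fixup.
    fixup = exception_table.get(pc)
    if fixup is None:
        raise RuntimeError("no exception handler for access")  # "panic"
    return ("fixup", fixup)
```

In this model, code that reads user memory from a PMU interrupt would register a fixup for its load instruction, so a hash miss makes the access fail cleanly instead of re-entering `hash_page`.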
2009-08-17 09:17:54 +04:00
/*
 * We come here as a result of a DSI at a point where we don't want
 * to call hash_page, such as when we are accessing memory (possibly
 * user memory) inside a PMU interrupt that occurred while interrupts
 * were soft-disabled. We want to invoke the exception handler for
 * the access, or panic if there isn't a handler.
 */
2020-02-25 20:35:37 +03:00
77:	addi	r3,r1,STACK_FRAME_OVERHEAD
2009-08-17 09:17:54 +04:00
li	r5,SIGSEGV
2014-02-04 09:04:35 +04:00
bl	bad_page_fault
2020-02-25 20:35:37 +03:00
b	interrupt_return