KVM: SVM: Split svm_vcpu_run inline assembly to separate file

The compiler (GCC) does not like a situation where an inline assembly
block clobbers all available machine registers in the middle of a
function. This situation can be found in the function svm_vcpu_run in
the file kvm/svm.c and results in many register spills and fills
to/from the stack frame.

This patch fixes the issue with the same approach as was used for VMX
some time ago: the big inline assembly block is moved to a separate
assembly .S file, taking into account all ABI requirements.

There are two main benefits of this approach:

* elimination of several register spills and fills to/from the stack
  frame, and consequently a smaller function .text size. The binary
  size of svm_vcpu_run is lowered from 2019 to 1626 bytes.

* more efficient access to the register save array (see the sketch
  after this message). Currently, the register save array is accessed
  as:

    7b00: 48 8b 98 28 02 00 00  mov 0x228(%rax),%rbx
    7b07: 48 8b 88 18 02 00 00  mov 0x218(%rax),%rcx
    7b0e: 48 8b 90 20 02 00 00  mov 0x220(%rax),%rdx

  whereas by passing a pointer to the register array as an argument to
  a function one gets:

    12: 48 8b 48 08             mov 0x8(%rax),%rcx
    16: 48 8b 50 10             mov 0x10(%rax),%rdx
    1a: 48 8b 58 18             mov 0x18(%rax),%rbx

As a result, the total size, considering that the new function size is
229 bytes, gets lowered by 164 bytes.

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
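To make the second point concrete, here is a minimal, self-contained C sketch of the access-pattern difference the message describes. It is not kernel code: the struct name, field names and the 0x210 padding size are hypothetical, chosen only so that the generated displacements resemble the disassembly quoted above. Reading the register save array through the enclosing vCPU structure forces large displacements, while a helper that receives a pointer to the array itself gets by with small offsets.

#include <stdio.h>

#define NR_REGS 16

/* Hypothetical layout: a large blob of state placed before the register array. */
struct demo_vcpu {
	char          other_state[0x210];
	unsigned long regs[NR_REGS];
};

/* Access through the enclosing structure: large displacements (0x218, 0x220, ...). */
static unsigned long sum_via_vcpu(struct demo_vcpu *vcpu)
{
	return vcpu->regs[1] + vcpu->regs[2] + vcpu->regs[3];
}

/* Access through a pointer to the array itself: small displacements (0x8, 0x10, 0x18). */
static unsigned long sum_via_regs(unsigned long *regs)
{
	return regs[1] + regs[2] + regs[3];
}

int main(void)
{
	struct demo_vcpu vcpu = { .regs = { 0, 1, 2, 3 } };

	printf("%lu %lu\n", sum_via_vcpu(&vcpu), sum_via_regs(vcpu.regs));
	return 0;
}

Compiling both helpers and comparing the generated code (for example with objdump -d) shows the same difference in mov encodings that the message quotes; the exact offsets depend on the assumed layout.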
/* SPDX-License-Identifier: GPL-2.0 */
#include <linux/linkage.h>
#include <asm/asm.h>
#include <asm/asm-offsets.h>
#include <asm/bitsperlong.h>
#include <asm/kvm_vcpu_regs.h>
#include <asm/nospec-branch.h>
#include "kvm-asm-offsets.h"
#define WORD_SIZE (BITS_PER_LONG / 8)

/* Intentionally omit RAX as it's context switched by hardware */
#define VCPU_RCX	(SVM_vcpu_arch_regs + __VCPU_REGS_RCX * WORD_SIZE)
#define VCPU_RDX	(SVM_vcpu_arch_regs + __VCPU_REGS_RDX * WORD_SIZE)
#define VCPU_RBX	(SVM_vcpu_arch_regs + __VCPU_REGS_RBX * WORD_SIZE)
/* Intentionally omit RSP as it's context switched by hardware */
#define VCPU_RBP	(SVM_vcpu_arch_regs + __VCPU_REGS_RBP * WORD_SIZE)
#define VCPU_RSI	(SVM_vcpu_arch_regs + __VCPU_REGS_RSI * WORD_SIZE)
#define VCPU_RDI	(SVM_vcpu_arch_regs + __VCPU_REGS_RDI * WORD_SIZE)
#ifdef CONFIG_X86_64
#define VCPU_R8		(SVM_vcpu_arch_regs + __VCPU_REGS_R8  * WORD_SIZE)
#define VCPU_R9		(SVM_vcpu_arch_regs + __VCPU_REGS_R9  * WORD_SIZE)
#define VCPU_R10	(SVM_vcpu_arch_regs + __VCPU_REGS_R10 * WORD_SIZE)
#define VCPU_R11	(SVM_vcpu_arch_regs + __VCPU_REGS_R11 * WORD_SIZE)
#define VCPU_R12	(SVM_vcpu_arch_regs + __VCPU_REGS_R12 * WORD_SIZE)
#define VCPU_R13	(SVM_vcpu_arch_regs + __VCPU_REGS_R13 * WORD_SIZE)
#define VCPU_R14	(SVM_vcpu_arch_regs + __VCPU_REGS_R14 * WORD_SIZE)
#define VCPU_R15	(SVM_vcpu_arch_regs + __VCPU_REGS_R15 * WORD_SIZE)
#endif

#define SVM_vmcb01_pa	(SVM_vmcb01 + KVM_VMCB_pa)

.section .noinstr.text, "ax"
KVM: SVM: move MSR_IA32_SPEC_CTRL save/restore to assembly

Restoration of the host IA32_SPEC_CTRL value is probably too late
with respect to the return thunk training sequence.

With respect to the user/kernel boundary, AMD says, "If software chooses
to toggle STIBP (e.g., set STIBP on kernel entry, and clear it on kernel
exit), software should set STIBP to 1 before executing the return thunk
training sequence." I assume the same requirements apply to the guest/host
boundary. The return thunk training sequence is in vmenter.S, quite close
to the VM-exit. On hosts without V_SPEC_CTRL, however, the host's
IA32_SPEC_CTRL value is not restored until much later.

To avoid this, move the restoration of host SPEC_CTRL to assembly and,
for consistency, move the restoration of the guest SPEC_CTRL as well.
This is not particularly difficult, apart from some care to cover both
32- and 64-bit, and to share code between SEV-ES and normal vmentry.

Cc: stable@vger.kernel.org
Fixes: a149180fbcf3 ("x86: Add magic AMD return-thunk")
Suggested-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
.macro RESTORE_GUEST_SPEC_CTRL
/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
	ALTERNATIVE_2 "", \
		"jmp 800f", X86_FEATURE_MSR_SPEC_CTRL, \
		"", X86_FEATURE_V_SPEC_CTRL
801:
.endm
.macro RESTORE_GUEST_SPEC_CTRL_BODY
800:
	/*
	 * SPEC_CTRL handling: if the guest's SPEC_CTRL value differs from the
	 * host's, write the MSR.  This is kept out-of-line so that the common
	 * case does not have to jump.
	 *
	 * IMPORTANT: To avoid RSB underflow attacks and any other nastiness,
	 * there must not be any returns or indirect branches between this code
	 * and vmentry.
	 */
	movl SVM_spec_ctrl(%_ASM_DI), %eax
	cmp PER_CPU_VAR(x86_spec_ctrl_current), %eax
	je 801b
	mov $MSR_IA32_SPEC_CTRL, %ecx
	xor %edx, %edx
	wrmsr
	jmp 801b
.endm
.macro RESTORE_HOST_SPEC_CTRL
/* No need to do anything if SPEC_CTRL is unset or V_SPEC_CTRL is set */
	ALTERNATIVE_2 "", \
		"jmp 900f", X86_FEATURE_MSR_SPEC_CTRL, \
		"", X86_FEATURE_V_SPEC_CTRL
901:
.endm
.macro RESTORE_HOST_SPEC_CTRL_BODY
900:
	/* Same for after vmexit. */
	mov $MSR_IA32_SPEC_CTRL, %ecx

	/*
	 * Load the value that the guest had written into MSR_IA32_SPEC_CTRL,
	 * if it was not intercepted during guest execution.
	 */
	cmpb $0, (%_ASM_SP)
	jnz 998f
	rdmsr
	movl %eax, SVM_spec_ctrl(%_ASM_DI)
998:

	/* Now restore the host value of the MSR if different from the guest's. */
	movl PER_CPU_VAR(x86_spec_ctrl_current), %eax
	cmp SVM_spec_ctrl(%_ASM_DI), %eax
	je 901b
	xor %edx, %edx
	wrmsr
	jmp 901b
.endm
/**
 * __svm_vcpu_run - Run a vCPU via a transition to SVM guest mode
 * @svm:	struct vcpu_svm *
 * @spec_ctrl_intercepted: bool
 */
SYM_FUNC_START(__svm_vcpu_run)
	push %_ASM_BP
#ifdef CONFIG_X86_64
	push %r15
	push %r14
	push %r13
	push %r12
#else
	push %edi
	push %esi
#endif
	push %_ASM_BX

	/*
	 * Save variables needed after vmexit on the stack, in inverse
	 * order compared to when they are needed.
	 */
/* Accessed directly from the stack in RESTORE_HOST_SPEC_CTRL. */
	push %_ASM_ARG2

	/* Needed to restore access to percpu variables. */
	__ASM_SIZE(push) PER_CPU_VAR(svm_data + SD_save_area_pa)
/* Finally save @svm. */
	push %_ASM_ARG1

.ifnc _ASM_ARG1, _ASM_DI
	/*
	 * Stash @svm in RDI early. On 32-bit, arguments are in RAX, RCX
	 * and RDX which are clobbered by RESTORE_GUEST_SPEC_CTRL.
	 */
	mov %_ASM_ARG1, %_ASM_DI
.endif
/* Clobbers RAX, RCX, RDX. */
	RESTORE_GUEST_SPEC_CTRL

	/*
	 * Use a single vmcb (vmcb01 because it's always valid) for
	 * context switching guest state via VMLOAD/VMSAVE, that way
	 * the state doesn't need to be copied between vmcb01 and
	 * vmcb02 when switching vmcbs for nested virtualization.
	 */
	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
1:	vmload %_ASM_AX
2:

	/* Get svm->current_vmcb->pa into RAX. */
	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX
/* Load guest registers. */
	mov VCPU_RCX(%_ASM_DI), %_ASM_CX
	mov VCPU_RDX(%_ASM_DI), %_ASM_DX
	mov VCPU_RBX(%_ASM_DI), %_ASM_BX
	mov VCPU_RBP(%_ASM_DI), %_ASM_BP
	mov VCPU_RSI(%_ASM_DI), %_ASM_SI
#ifdef CONFIG_X86_64
	mov VCPU_R8(%_ASM_DI),  %r8
	mov VCPU_R9(%_ASM_DI),  %r9
	mov VCPU_R10(%_ASM_DI), %r10
	mov VCPU_R11(%_ASM_DI), %r11
	mov VCPU_R12(%_ASM_DI), %r12
	mov VCPU_R13(%_ASM_DI), %r13
	mov VCPU_R14(%_ASM_DI), %r14
	mov VCPU_R15(%_ASM_DI), %r15
#endif
	mov VCPU_RDI(%_ASM_DI), %_ASM_DI
/* Enter guest mode */
sti
3:	vmrun %_ASM_AX
4:
	cli
/* Pop @svm to RAX while it's the only available register. */
	pop %_ASM_AX

	/* Save all guest registers. */
	mov %_ASM_CX, VCPU_RCX(%_ASM_AX)
	mov %_ASM_DX, VCPU_RDX(%_ASM_AX)
	mov %_ASM_BX, VCPU_RBX(%_ASM_AX)
	mov %_ASM_BP, VCPU_RBP(%_ASM_AX)
	mov %_ASM_SI, VCPU_RSI(%_ASM_AX)
	mov %_ASM_DI, VCPU_RDI(%_ASM_AX)
#ifdef CONFIG_X86_64
	mov %r8,  VCPU_R8(%_ASM_AX)
	mov %r9,  VCPU_R9(%_ASM_AX)
	mov %r10, VCPU_R10(%_ASM_AX)
	mov %r11, VCPU_R11(%_ASM_AX)
	mov %r12, VCPU_R12(%_ASM_AX)
	mov %r13, VCPU_R13(%_ASM_AX)
	mov %r14, VCPU_R14(%_ASM_AX)
	mov %r15, VCPU_R15(%_ASM_AX)
#endif
	/* @svm can stay in RDI from now on. */
	mov %_ASM_AX, %_ASM_DI

	mov SVM_vmcb01_pa(%_ASM_DI), %_ASM_AX
5:	vmsave %_ASM_AX
6:

	/* Restores GSBASE among other things, allowing access to percpu data. */
	pop %_ASM_AX
7:	vmload %_ASM_AX
8:
#ifdef CONFIG_RETPOLINE
	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
/* Clobbers RAX, RCX, RDX. */
	RESTORE_HOST_SPEC_CTRL

	/*
	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
	 * untrained as soon as we exit the VM and are back to the
	 * kernel. This should be done before re-enabling interrupts
	 * because interrupt handlers won't sanitize 'ret' if the return is
	 * from the kernel.
	 */
	UNTRAIN_RET
	/*
	 * Clear all general purpose registers except RSP and RAX to prevent
	 * speculative use of the guest's values, even those that are reloaded
	 * via the stack.  In theory, an L1 cache miss when restoring registers
	 * could lead to speculative execution with the guest's values.
	 * Zeroing XORs are dirt cheap, i.e. the extra paranoia is essentially
	 * free.  RSP and RAX are exempt as they are restored by hardware
	 * during VM-Exit.
	 */
	xor %ecx, %ecx
	xor %edx, %edx
	xor %ebx, %ebx
	xor %ebp, %ebp
	xor %esi, %esi
	xor %edi, %edi
#ifdef CONFIG_X86_64
	xor %r8d,  %r8d
	xor %r9d,  %r9d
	xor %r10d, %r10d
	xor %r11d, %r11d
	xor %r12d, %r12d
	xor %r13d, %r13d
	xor %r14d, %r14d
	xor %r15d, %r15d
#endif
/* "Pop" @spec_ctrl_intercepted. */
	pop %_ASM_BX
	pop %_ASM_BX
#ifdef CONFIG_X86_64
	pop %r12
	pop %r13
	pop %r14
	pop %r15
#else
	pop %esi
	pop %edi
#endif
	pop %_ASM_BP
	RET
	RESTORE_GUEST_SPEC_CTRL_BODY
	RESTORE_HOST_SPEC_CTRL_BODY

10:	cmpb $0, kvm_rebooting
	jne 2b
	ud2
30:	cmpb $0, kvm_rebooting
	jne 4b
	ud2
50:	cmpb $0, kvm_rebooting
	jne 6b
	ud2
70:	cmpb $0, kvm_rebooting
	jne 8b
	ud2

	_ASM_EXTABLE(1b, 10b)
	_ASM_EXTABLE(3b, 30b)
	_ASM_EXTABLE(5b, 50b)
	_ASM_EXTABLE(7b, 70b)
SYM_FUNC_END(__svm_vcpu_run)

/**
 * __svm_sev_es_vcpu_run - Run a SEV-ES vCPU via a transition to SVM guest mode
 * @svm:	struct vcpu_svm *
 * @spec_ctrl_intercepted: bool
2020-12-10 11:10:08 -06:00
* /
SYM_ F U N C _ S T A R T ( _ _ s v m _ s e v _ e s _ v c p u _ r u n )
push % _ A S M _ B P
# ifdef C O N F I G _ X 8 6 _ 6 4
push % r15
push % r14
push % r13
push % r12
# else
push % e d i
push % e s i
# endif
push % _ A S M _ B X
	/*
	 * Save variables needed after vmexit on the stack, in inverse
	 * order compared to when they are needed.
	 */

	/* Accessed directly from the stack in RESTORE_HOST_SPEC_CTRL. */
	push %_ASM_ARG2

	/* Save @svm. */
	push %_ASM_ARG1

.ifnc _ASM_ARG1, _ASM_DI
	/*
	 * Stash @svm in RDI early. On 32-bit, arguments are in RAX, RCX
	 * and RDX which are clobbered by RESTORE_GUEST_SPEC_CTRL.
	 */
	mov %_ASM_ARG1, %_ASM_DI
.endif

	/* Clobbers RAX, RCX, RDX. */
	RESTORE_GUEST_SPEC_CTRL
/* Get svm->current_vmcb->pa into RAX. */
	mov SVM_current_vmcb(%_ASM_DI), %_ASM_AX
	mov KVM_VMCB_pa(%_ASM_AX), %_ASM_AX

	/* Enter guest mode */
	sti

1:	vmrun %_ASM_AX
2:	cli
/* Pop @svm to RDI, guest registers have been saved already. */
	pop %_ASM_DI

#ifdef CONFIG_RETPOLINE
	/* IMPORTANT: Stuff the RSB immediately after VM-Exit, before RET! */
	FILL_RETURN_BUFFER %_ASM_AX, RSB_CLEAR_LOOPS, X86_FEATURE_RETPOLINE
#endif
/* Clobbers RAX, RCX, RDX. */
	RESTORE_HOST_SPEC_CTRL

	/*
	 * Mitigate RETBleed for AMD/Hygon Zen uarch. RET should be
	 * untrained as soon as we exit the VM and are back to the
	 * kernel. This should be done before re-enabling interrupts
	 * because interrupt handlers won't sanitize RET if the return is
	 * from the kernel.
	 */
	UNTRAIN_RET
/* "Pop" @spec_ctrl_intercepted. */
	pop %_ASM_BX

	pop %_ASM_BX
#ifdef CONFIG_X86_64
	pop %r12
	pop %r13
	pop %r14
	pop %r15
#else
	pop %esi
	pop %edi
#endif
	pop %_ASM_BP
	RET
	RESTORE_GUEST_SPEC_CTRL_BODY
	RESTORE_HOST_SPEC_CTRL_BODY

3:	cmpb $0, kvm_rebooting
	jne 2b
	ud2

	_ASM_EXTABLE(1b, 3b)

SYM_FUNC_END(__svm_sev_es_vcpu_run)
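For orientation, here is a hedged, self-contained C sketch of how a caller might dispatch between the two entry points defined above. The struct contents, the sev_es flag, and the vcpu_enter_exit() wrapper are stand-ins invented for this sketch, not the actual KVM code; only the two entry-point names and their @svm / @spec_ctrl_intercepted parameters are taken from the kernel-doc comments in this file, and the real asm routines are replaced by printing stubs so the example compiles and runs on its own.

#include <stdbool.h>
#include <stdio.h>

/* Stub stand-in for the real struct vcpu_svm. */
struct vcpu_svm {
	bool sev_es;	/* hypothetical flag: is this a SEV-ES guest? */
};

/* Printing stubs standing in for the assembly entry points in this file. */
static void __svm_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted)
{
	(void)svm;
	printf("normal vmentry, spec_ctrl_intercepted=%d\n", spec_ctrl_intercepted);
}

static void __svm_sev_es_vcpu_run(struct vcpu_svm *svm, bool spec_ctrl_intercepted)
{
	(void)svm;
	printf("SEV-ES vmentry, spec_ctrl_intercepted=%d\n", spec_ctrl_intercepted);
}

/* Hypothetical dispatcher: both entry points share the same two arguments. */
static void vcpu_enter_exit(struct vcpu_svm *svm, bool spec_ctrl_intercepted)
{
	if (svm->sev_es)
		__svm_sev_es_vcpu_run(svm, spec_ctrl_intercepted);
	else
		__svm_vcpu_run(svm, spec_ctrl_intercepted);
}

int main(void)
{
	struct vcpu_svm svm = { .sev_es = false };

	vcpu_enter_exit(&svm, true);
	return 0;
}

The point of the sketch is only the shared signature and the SEV-ES/normal split; how KVM actually computes spec_ctrl_intercepted and wraps these calls lives in svm.c and is not reproduced here.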