[PATCH 2/2] x86/svm: Keep the RAS balanced for guests
One source of lost performance was the fact that, to protect Xen from
malicious guests, we had to flush the RAS.
It turns out that CET Shadow Stacks give us enough architectural guarantees to
construct a lower-overhead mitigation, which keeps the RAS balanced for guests
so their return performance is still good.
To keep the RAS balanced, Xen must execute the same number of CALLs as RETs
across one VMexit->VMEntry. Without CET-SS, we could achieve this fairly
easily with a `call; add $8, %rsp` and `push; ret` pair, but this is not legal
under CET-SS. In fact, CALL is the only shadow stack "push" operation we
have, and we can't use it a second time if we intend to keep the RAS balanced.
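For illustration only (none of this is in the patch), the pairing described
above might look as follows without CET-SS: one CALL on the VMExit path and
one RET on the VMEntry path, keeping the RAS balanced across the pair:

        /* VMExit path: one CALL pushes a RAS entry; the architectural
         * return address is then discarded. */
        call 1f
        int3                    /* Stop rogue RAS speculation. */
1:
        add $8, %rsp            /* Drop the architectural return address. */

        ...

        /* VMEntry path: one RET consumes the RAS entry.  It mispredicts
         * (to the INT3 above), but architecturally continues at 2f. */
        lea 2f(%rip), %rax
        push %rax
        ret
2:

Under CET-SS this construct faults: PUSH doesn't write the shadow stack, so
the RET finds mismatched return addresses and raises #CP.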
Instead, we keep a real return address on the stack. This means that for some
of entry.S, %rsp conditionally doesn't reference CPUINFO.
This necessitates swapping the current order of DO_OVERWRITE_RSB and
svm_vmexit_spec_ctrl; while those two don't have any specific ordering
requirements, push_one_ras needs to come after svm_vmexit_spec_ctrl, or else
we would need some very invasive changes to cope with the altered %rsp.
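Concretely, with the mitigation active the stack between push_one_ras and the
rebalancing RET looks like this (sketch derived from the patch below; without
the mitigation, %rsp points directly at CPUINFO):

        /*
         *   (%rsp) -> svm_ras_speculation_trap  (the kept return address)
         *  8(%rsp) -> what %rsp would normally point at (CPUINFO)
         */

hence the "%rsp /may/ not reference cpu_info" warnings.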
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Wei Liu <wl@xxxxxxx>
RFC for a couple of reasons. This does function correctly, but I still want
to do more perf testing.
Second, X86_FEATURE_ALWAYS is clearly not ok for committing. I'm still
debating whether to make this construct available in !CET-SS cases.
Mechanically, it's fine, but the safety arguments depend on CET-SS being
active.
In principle, on CPUs which do not suffer Branch Type Confusion, you might be
able to make a defence-in-depth argument that, if an attacker can't control
indirect speculation, they can't bypass the 1-stuff safety either, but the
only AMD CPUs not vulnerable to BTC have CET-SS anyway.
Third, I'd like some early feedback on how clear the logic is, given the
conditional nature of %rsp not referencing CPUINFO.
Fourth, the alternatives logic (I think) needs improving to not fix up a
direct CALL/JMP displacement if the destination is within the replacement
length. I did the functional testing before wrapping things in alternatives.
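push_one_ras itself is an instance of this: its CALL targets a label inside
the replacement, so the displacement is already correct wherever the blob is
copied, and re-basing it as if it targeted something outside the blob would
break it:

        /* Within the ALTERNATIVE replacement: */
        call 1f       /* rel32 destination lies inside the replacement, */
        int3          /* so it must not be re-based during patching.    */
1: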
---
xen/arch/x86/hvm/svm/entry.S | 55 ++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 53 insertions(+), 2 deletions(-)
diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S
index be4ce52bd81d..98934db41fec 100644
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -22,8 +22,41 @@
 #include <asm/asm_defns.h>
 #include <asm/page.h>
 
+.macro push_one_ras
+        /*
+         * Pushes one entry into the RAS, then updates the return address(es)
+         * to point at svm_ras_speculation_trap.
+         *
+         * Rogue RAS-speculation will hit the INT3 and stop. Architectural
+         * execution will go to svm_ras_speculation_trap.
+         *
+         * This deliberately leaves the (modified) return address on the
+         * stack(s).
+         */
+        call 1f
+        int3
+1:
+        lea svm_ras_speculation_trap(%rip), %rax
+
+#ifdef CONFIG_XEN_SHSTK
+        rdsspq %rcx
+        wrssq %rax, (%rcx)
+#endif
+        mov %rax, (%rsp)
+.endm
+
 ENTRY(svm_asm_do_resume)
         GET_CURRENT(bx)
+
+        /*
+         * We've just been schedule()'d. There's no speculation safety needed
+         * here, but we do need to set the stack up in the manner expected by
+         * later logic.
+         */
+        ALTERNATIVE "", push_one_ras, X86_FEATURE_ALWAYS
+
+        /* WARNING! After this point, %rsp /may/ not reference cpu_info. */
+
 .Lsvm_do_resume:
         call svm_intr_assist
         call nsvm_vcpu_switch
@@ -56,6 +89,20 @@ __UNLIKELY_END(nsvm_hap)
         clgi
 
         /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
+        /* WARNING! Before this point, %rsp /may/ not reference cpu_info. */
+
+        /*
+         * If we're trying to balance the RAS for guests, push_one_ras in the
+         * VMExit path was necessary for speculative safety, but the on-stack
+         * return address was deliberately updated to point here.
+         *
+         * We execute one RET to re-balance the RAS. It will mispredict (to
+         * the INT3 in push_one_ras in the general case), but won't
+         * architecturally change the instruction flow.
+         */
+        ALTERNATIVE "", ret, X86_FEATURE_ALWAYS
+svm_ras_speculation_trap:
+
         /* SPEC_CTRL_EXIT_TO_SVM   Req: b=curr %rsp=regs/cpuinfo, Clob: acd */
         .macro svm_vmentry_spec_ctrl
             mov VCPU_arch_msrs(%rbx), %rax
@@ -108,8 +155,6 @@ __UNLIKELY_END(nsvm_hap)
         .endm
         ALTERNATIVE "", svm_vmexit_cond_ibpb, X86_FEATURE_IBPB_ENTRY_HVM
 
-        ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM
-
         .macro svm_vmexit_spec_ctrl
             movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax
             movzbl CPUINFO_last_spec_ctrl(%rsp), %edx
@@ -122,6 +167,12 @@ __UNLIKELY_END(nsvm_hap)
 1:
         .endm
         ALTERNATIVE "", svm_vmexit_spec_ctrl, X86_FEATURE_SC_MSR_HVM
+
+        ALTERNATIVE_2 "", \
+            DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM, \
+            push_one_ras, X86_FEATURE_ALWAYS
+
+        /* WARNING! After this point, %rsp /may/ not reference cpu_info. */
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         /*
--
2.11.0