[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH 2/2] x86/svm: Keep the RAS balanced for guests


  • To: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Thu, 11 Aug 2022 20:59:05 +0100
  • Authentication-results: esa6.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Thu, 11 Aug 2022 19:59:44 +0000
  • Ironport-data: A9a23:9eRJ26qyzFAvQcwhfiU2OyeLqi9eBmJ2ZRIvgKrLsJaIsI4StFCzt garIBnUMv+Pa2WmfYx1YIzipE8Cv5CHydNlTVds+yhjF35Ap5uZCYyVIHmrMnLJJKUvbq7GA +byyDXkBJppJpMJjk71atANlVEliefSAOKU5NfsYkhZXRVjRDoqlSVtkus4hp8AqdWiCkaGt MiaT/f3YTdJ4BYpdDNPg06/gEk35q6q6GpB5gZWic1j5zcyqVFEVPrzGonpR5fIatE8NvK3Q e/F0Ia48gvxl/v6Ior4+lpTWhRiro/6ZWBiuFIPM0SRqkEqShgJ+rQ6LJIhhXJ/0F1lqTzTJ OJl7vRcQS9xVkHFdX90vxNwS0mSNoUekFPLzOTWXWV+ACQqflO1q8iCAn3aMqUi+P5wIkB21 sADE20SST2RqL6Zx7+CH7wEasQLdKEHPasas3BkizrYEewnUdbIRKCiCd1whWlqwJoURLCHO pRfOWEHgBfoOnWjPn8+Dp4kkfjurX74azBC83qepLYt4niVxwt0uFToGIWKJILWHZsK9qqej kDn0liiKUoKDfyexwah4nicn8nWgQquDer+E5Xnr6U30TV/3Fc7Fxk+RVa95/6jhSaWefhSN kgV8SoGtrUp+QqgSdyVdw21pjuIswARX/JUEvYm80edx6zM+QGbC2MYCDlbZ7QbWNQeHGJwk AXTxpWwWGIp4Ob9pW+hGqm8lzGqPgs0FUw+fhRZUiwo8fa/j4Y+t0eaJjp8K5JZnuEZCBmpn W7S9Hlh3uxN5SIY//7lpA6a2lpAsrCMF1dovVuPAwpJ+ysjPOaYi5qUBU83BBqqBKKQVRG/s XcNgKByB8heXMjWxERhrAjgdYxFBspp0xWG2DaD57F7q1yQF4eLJOi8Gg1WKkZzKdojcjT0e kLVsg45zMYNYiPzMPEqM9juW51CIU3c+TPNCJjpgidmOMAtJGdrAgk3DaJv44wduBd1yvxuU XtqWc2tEWwbGcxa8dZCfM9EiOdD7n1vmgvuqWXTlUvPPUy2OCHIEt/o8TKmMogE0U9ziF+Fr o8AbZvVkEg3vS+XSnC/zLP/5GsidRATba0aYeQNHgJfCmKKwF0cNsI=
  • Ironport-hdrordr: A9a23:rBmlYKigC92mcrvkZ9POMzXn2HBQXuIji2hC6mlwRA09TySZ// rBoB19726MtN9xYgBHpTnuAsm9qB/nmaKdpLNhWItKPzOW31dATrsSjrcKqgeIc0aVm9K1l5 0QF5SWYOeAdWSS5vya3ODXKbkdKaG8gcKVuds=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

One source of lost performance was that fact that to protect Xen from a
malicious guests, we had to flush the RAS.

It turns out that CET Shadow Stacks give us enough architectural guarantees to
construct a lower overhead mitigation, which keeps the RAS balanced for the
guest so their return performance is still good.

To keep the RAS balanced, Xen must execute the same number of CALLs as RETs
across one VMexit->VMEntry.  Without CET-SS, we could achieve this fairly
easily with a `call; add $8, %rsp` and `push; ret` pair, but this is not legal
under CET-SS.  In fact, CALL is the only shadow stack "push" operation we
have, and we can't use it a second time if we intend to keep the RAS balanced.

Instead, we keep a real return address on the stack.  This means that for some
of entry.S, %rsp conditionally doesn't reference CPUINFO.

This necessitates swapping the current order of DO_OVERWRITE_RSB and
svm_vmexit_spec_ctrl; while they don't have any specific ordering
requirements, push_one_ras needs to come after svm_vmexit_spec_ctrl or else we
need some very invasive changes to fix up the %rsp changes.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Wei Liu <wl@xxxxxxx>

RFC for a couple of reasons.  This does function correctly, but I still want
to do more perf testing.

Secondly, X86_FEATURE_ALWAYS is clearly not ok for committing.  I'm still
debating whether to make this construct available in !CET-SS cases.
Mechanically, its fine, but the safety arguments depend on CET-SS being
active.

In principle, on CPUs which do not suffer Branch Type Confusion, you might be
able to reason a defence-in-depth argument that if an attacker can't control
indirect speculation, then they can't bypass the 1-stuff safety either, but
the only AMD CPUs not vulnerable to BTC have CET-SS anyway.

Third, I'd like some early feedback on how clear it the logic is given the
conditional nature of %rsp not referencing CPUINFO.

Fourth, the alternatives logic (I think) needs improving to not fix up a
direct CALL/JMP displacement if the destination is within the replacement
length.  I did the functional testing before wrapping things in alternatives.
---
 xen/arch/x86/hvm/svm/entry.S | 55 ++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 53 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/entry.S b/xen/arch/x86/hvm/svm/entry.S
index be4ce52bd81d..98934db41fec 100644
--- a/xen/arch/x86/hvm/svm/entry.S
+++ b/xen/arch/x86/hvm/svm/entry.S
@@ -22,8 +22,41 @@
 #include <asm/asm_defns.h>
 #include <asm/page.h>
 
+.macro push_one_ras
+        /*
+         * Pushes one entry into the RAS, then updates the return address(es)
+         * to point at svm_ras_speculation_trap.
+         *
+         * Rogue RAS-speculation will hit the INT3 and stop.  Architectural
+         * execution will go to svm_ras_speculation_trap.
+         *
+         * This deliberately leaves the (modified) return address on the
+         * stack(s).
+         */
+        call 1f
+        int3
+1:
+        lea     svm_ras_speculation_trap(%rip), %rax
+
+#ifdef CONFIG_XEN_SHSTK
+        rdsspq  %rcx
+        wrssq   %rax, (%rcx)
+#endif
+        mov     %rax, (%rsp)
+.endm
+
 ENTRY(svm_asm_do_resume)
         GET_CURRENT(bx)
+
+        /*
+         * We've just been schedule()'d.  There's no speculation safety needed
+         * here, but we do need to set the stack up in the manner expected by
+         * later logic.
+         */
+        ALTERNATIVE "", push_one_ras, X86_FEATURE_ALWAYS
+
+        /* WARNING!  After this point, %rsp /may/ not reference cpu_info. */
+
 .Lsvm_do_resume:
         call svm_intr_assist
         call nsvm_vcpu_switch
@@ -56,6 +89,20 @@ __UNLIKELY_END(nsvm_hap)
         clgi
 
         /* WARNING! `ret`, `call *`, `jmp *` not safe beyond this point. */
+        /* WARNING!  Before this point, %rsp /may/ not reference cpu_info. */
+
+        /*
+         * If we're trying to balance the RAS for guests, push_one_ras in the
+         * VMExit path was necessary for speculative safety, but the on-stack
+         * return address was deliberately updated to point here.
+         *
+         * We execute one RET to re-balance the RAS.  It will mispredict (to
+         * the INT3 in push_one_ras in the general case), but won't
+         * architecturally change the instruction flow.
+         */
+        ALTERNATIVE "", ret, X86_FEATURE_ALWAYS
+svm_ras_speculation_trap:
+
         /* SPEC_CTRL_EXIT_TO_SVM       Req: b=curr %rsp=regs/cpuinfo, Clob: 
acd */
         .macro svm_vmentry_spec_ctrl
             mov    VCPU_arch_msrs(%rbx), %rax
@@ -108,8 +155,6 @@ __UNLIKELY_END(nsvm_hap)
        .endm
         ALTERNATIVE "", svm_vmexit_cond_ibpb, X86_FEATURE_IBPB_ENTRY_HVM
 
-        ALTERNATIVE "", DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM
-
         .macro svm_vmexit_spec_ctrl
             movzbl CPUINFO_xen_spec_ctrl(%rsp), %eax
             movzbl CPUINFO_last_spec_ctrl(%rsp), %edx
@@ -122,6 +167,12 @@ __UNLIKELY_END(nsvm_hap)
 1:
         .endm
         ALTERNATIVE "", svm_vmexit_spec_ctrl, X86_FEATURE_SC_MSR_HVM
+
+        ALTERNATIVE_2 "", \
+             DO_OVERWRITE_RSB, X86_FEATURE_SC_RSB_HVM,  \
+             push_one_ras, X86_FEATURE_ALWAYS
+
+        /* WARNING!  After this point, %rsp /may/ not reference cpu_info. */
         /* WARNING! `ret`, `call *`, `jmp *` not safe before this point. */
 
         /*
-- 
2.11.0




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.