
[PATCH v2] x86/PV: issue branch prediction barrier when switching 64-bit guest to kernel mode


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 19 Jul 2022 14:55:17 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Tue, 19 Jul 2022 12:55:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Since both kernel and user mode of 64-bit PV guests run in ring 3, they
run in the same "predictor mode". While the kernel could take care of
this itself, doing so would be yet another item distinguishing PV from
native. Additionally we're in a much better position to issue the barrier
command, and we can save a #GP (for privileged instruction emulation)
this way.

To allow guests to recover performance, introduce a new VM assist allowing
the guest kernel to suppress this barrier.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
v2: Leverage entry-IBPB. Add VM assist. Re-base.
---
I'm not entirely happy with re-using opt_ibpb_ctxt_switch here (this is a
mode switch, not a context switch, after all; v1 used opt_ibpb), but it
also didn't seem very reasonable to introduce yet another command line
option. The only feasible alternative I see would be to check the CPUID
bits directly.
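
For reference, the condition under which the barrier gets issued can be
modelled in isolation. This is only a sketch and not part of the patch:
needs_mode_switch_ibpb() and its parameter names are made up for
illustration, standing in for opt_ibpb_ctxt_switch, the domain's
SCF_entry_ibpb flag, and the new VM assist respectively.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical model of the check added to toggle_guest_mode(): returns
 * true when a guest-user to guest-kernel switch would write
 * PRED_CMD_IBPB to MSR_PRED_CMD.
 */
static bool needs_mode_switch_ibpb(bool ibpb_ctxt_switch, /* command line */
                                   bool entry_ibpb,       /* SCF_entry_ibpb */
                                   bool assist_no_ibpb)   /* VM assist set */
{
    return ibpb_ctxt_switch && !entry_ibpb && !assist_no_ibpb;
}

/* Walk the rows of the truth table that matter. */
static void check_mode_switch_ibpb(void)
{
    /* Default case: barrier is issued. */
    assert(needs_mode_switch_ibpb(true, false, false));
    /* Guest kernel opted out via the VM assist. */
    assert(!needs_mode_switch_ibpb(true, false, true));
    /* Entry-IBPB in use for the domain: no further barrier here. */
    assert(!needs_mode_switch_ibpb(true, true, false));
    /* Disabled on the command line. */
    assert(!needs_mode_switch_ibpb(false, false, false));
}
```

The entry-IBPB case is skipped presumably because the switch is reached
through an entry into Xen, where a barrier was already issued.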

--- a/xen/arch/x86/include/asm/domain.h
+++ b/xen/arch/x86/include/asm/domain.h
@@ -757,7 +757,8 @@ static inline void pv_inject_sw_interrup
  * but we can't make such requests fail all of the sudden.
  */
 #define PV64_VM_ASSIST_MASK (PV32_VM_ASSIST_MASK                      | \
-                             (1UL << VMASST_TYPE_m2p_strict))
+                             (1UL << VMASST_TYPE_m2p_strict)          | \
+                             (1UL << VMASST_TYPE_mode_switch_no_ibpb))
 #define HVM_VM_ASSIST_MASK  (1UL << VMASST_TYPE_runstate_update_flag)
 
 #define arch_vm_assist_valid_mask(d) \
--- a/xen/arch/x86/pv/domain.c
+++ b/xen/arch/x86/pv/domain.c
@@ -467,7 +467,15 @@ void toggle_guest_mode(struct vcpu *v)
     if ( v->arch.flags & TF_kernel_mode )
         v->arch.pv.gs_base_kernel = gs_base;
     else
+    {
         v->arch.pv.gs_base_user = gs_base;
+
+        if ( opt_ibpb_ctxt_switch &&
+             !(d->arch.spec_ctrl_flags & SCF_entry_ibpb) &&
+             !VM_ASSIST(d, mode_switch_no_ibpb) )
+            wrmsrl(MSR_PRED_CMD, PRED_CMD_IBPB);
+    }
+
     asm volatile ( "swapgs" );
 
     _toggle_guest_pt(v);
--- a/xen/include/public/xen.h
+++ b/xen/include/public/xen.h
@@ -571,6 +571,16 @@ DEFINE_XEN_GUEST_HANDLE(mmuext_op_t);
  */
 #define VMASST_TYPE_m2p_strict           32
 
+/*
+ * x86-64 guests: Suppress IBPB on guest-user to guest-kernel mode switch.
+ *
+ * By default (on affected and capable hardware) Xen issues, as a safety
+ * measure, a barrier on user->kernel mode switches of PV guests, to cover
+ * for the fact that guest-kernel and guest-user modes both run in ring 3
+ * (and hence share prediction context).
+ */
+#define VMASST_TYPE_mode_switch_no_ibpb  33
+
 #if __XEN_INTERFACE_VERSION__ < 0x00040600
 #define MAX_VMASST_TYPE                  3
 #endif



 

