[Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

When __context_switch() is being bypassed during original context
switch handling, the vCPU "owning" the VMCS partially loses control of
it: It will appear non-running to remote CPUs, and hence their attempt
to pause the owning vCPU will have no effect on it (as it already
looks to be paused). At the same time the "owning" CPU will re-enable
interrupts eventually (at the latest when entering the idle loop) and
hence becomes subject to IPIs from other CPUs requesting access to the
VMCS. As a result, when __context_switch() finally gets run, the CPU
may no longer have the VMCS loaded, and hence any accesses to it would
fail. Hence we may need to re-load the VMCS in vmx_ctxt_switch_from().

Similarly, when __context_switch() is being bypassed also on the second
(switch-in) path, VMCS ownership may have been lost and hence needs
re-establishing. Since there's no existing hook to put this in, add a
new one.

Reported-by: Kevin Mayer <Kevin.Mayer@xxxxxxxx>
Reported-by: Anshul Makkar <anshul.makkar@xxxxxxxxxx>
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
v2: Drop the spin loop from vmx_vmcs_reload(). Use the function in
    vmx_do_resume() instead of open coding it there (requiring the
    ASSERT()s to be adjusted/dropped). Drop the new
    ->ctxt_switch_same() hook.
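
For anyone trying to follow the sequence described in the commit
message, here is a minimal user-space sketch of it. This is not Xen
code: every model_* name is hypothetical, the global variables stand in
for this_cpu(current_vmcs) and the local interrupt flag, and the IPI
handler is assumed to simply leave no VMCS loaded on the CPU. It only
illustrates why a "reload if not current" check is enough once
interrupts are off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the relevant bits of struct vcpu. */
struct model_vcpu {
    uint64_t vmcs_pa;   /* address of this vCPU's VMCS */
    bool is_running;    /* what remote CPUs see */
};

static uint64_t model_current_vmcs;  /* models this_cpu(current_vmcs) */
static bool model_irqs_enabled;      /* models the local interrupt flag */
static unsigned int model_loads;     /* counts simulated VMCS loads */

/* Models loading the VMCS on the local CPU. */
static void model_load_vmcs(const struct model_vcpu *v)
{
    model_current_vmcs = v->vmcs_pa;
    model_loads++;
}

/* Same shape as the check in the patch: load only if we lost the VMCS. */
static void model_vmcs_reload(const struct model_vcpu *v)
{
    assert(v->is_running || !model_irqs_enabled);
    if (v->vmcs_pa == model_current_vmcs)
        return;
    model_load_vmcs(v);
}

/* Assumption: a remote request, delivered by IPI, unloads our VMCS. */
static void model_remote_ipi(void)
{
    model_current_vmcs = 0;
}

/* Walks through the race; returns how many loads were needed. */
static unsigned int model_race_demo(void)
{
    struct model_vcpu v = { .vmcs_pa = 0x1000, .is_running = true };

    model_current_vmcs = 0;
    model_loads = 0;
    model_irqs_enabled = false;

    model_vmcs_reload(&v);       /* initial load */
    model_vmcs_reload(&v);       /* still current: no second load */

    /* Switch-out bypassed: vCPU looks paused, interrupts come back on. */
    v.is_running = false;
    model_irqs_enabled = true;
    model_remote_ipi();          /* remote CPU takes the VMCS away */

    /* The deferred switch-out finally runs, interrupts disabled again. */
    model_irqs_enabled = false;
    model_vmcs_reload(&v);       /* ownership lost: load again */

    return model_loads;
}
```

Calling model_race_demo() from a test harness yields two loads: the
initial one and the one needed after the simulated IPI.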

--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -552,6 +552,20 @@ static void vmx_load_vmcs(struct vcpu *v
+void vmx_vmcs_reload(struct vcpu *v)
+{
+    /*
+     * As we may be running with interrupts disabled, we can't acquire
+     * v->arch.hvm_vmx.vmcs_lock here. However, with interrupts disabled
+     * the VMCS can't be taken away from us anymore if we still own it.
+     */
+    ASSERT(v->is_running || !local_irq_is_enabled());
+    if ( v->arch.hvm_vmx.vmcs_pa == this_cpu(current_vmcs) )
+        return;
+    vmx_load_vmcs(v);
+}
 int vmx_cpu_up_prepare(unsigned int cpu)
@@ -1678,10 +1692,7 @@ void vmx_do_resume(struct vcpu *v)
     bool_t debug_state;
     if ( v->arch.hvm_vmx.active_cpu == smp_processor_id() )
-    {
-        if ( v->arch.hvm_vmx.vmcs_pa != this_cpu(current_vmcs) )
-            vmx_load_vmcs(v);
-    }
+        vmx_vmcs_reload(v);
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -936,6 +937,18 @@ static void vmx_ctxt_switch_from(struct
     if ( unlikely(!this_cpu(vmxon)) )
         return;
+    if ( !v->is_running )
+    {
+        /*
+         * When this vCPU isn't marked as running anymore, a remote pCPU's
+         * attempt to pause us (from vmx_vmcs_enter()) won't have a reason
+         * to spin in vcpu_sleep_sync(), and hence that pCPU might have taken
+         * away the VMCS from us. As we're running with interrupts disabled,
+         * we also can't call vmx_vmcs_enter().
+         */
+        vmx_vmcs_reload(v);
+    }
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -174,6 +174,7 @@ void vmx_destroy_vmcs(struct vcpu *v);
 void vmx_vmcs_enter(struct vcpu *v);
 bool_t __must_check vmx_vmcs_try_enter(struct vcpu *v);
 void vmx_vmcs_exit(struct vcpu *v);
+void vmx_vmcs_reload(struct vcpu *v);
 #define CPU_BASED_VIRTUAL_INTR_PENDING        0x00000004
 #define CPU_BASED_USE_TSC_OFFSETING           0x00000008

