
Re: [Xen-devel] [PATCH v2 1/2] VMX: fix VMCS race on context-switch paths

On Thu, 2017-02-16 at 04:15 -0700, Jan Beulich wrote:
> When __context_switch() is being bypassed during original context
> switch handling, the vCPU "owning" the VMCS partially loses control of
> it: It will appear non-running to remote CPUs, and hence their attempt
> to pause the owning vCPU will have no effect on it (as it already
> looks to be paused). At the same time the "owning" CPU will re-enable
> interrupts eventually (at the latest when entering the idle loop) and
> hence becomes subject to IPIs from other CPUs requesting access to the
> VMCS. As a result, when __context_switch() finally gets run, the CPU
> may no longer have the VMCS loaded, and hence any accesses to it would
> fail. Hence we may need to re-load the VMCS in vmx_ctxt_switch_from().
> Similarly, when __context_switch() is being bypassed also on the second
> (switch-in) path, VMCS ownership may have been lost and hence needs
> re-establishing. Since there's no existing hook to put this in, add a
> new one.

This paragraph now has to be replaced with something about the
vmx_do_resume() change.
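
To illustrate the race the quoted description refers to, here is a toy
model in C. It is only a sketch, not code from the patch: every name
prefixed toy_ is a hypothetical stand-in, a single global models one
physical CPU's "currently loaded VMCS" pointer, and the IPI from a
remote CPU is modelled as a plain function call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the real Xen structures. */
struct toy_vmcs { int id; };
struct toy_vcpu { struct toy_vmcs *vmcs; };

/* Models the VMCS currently loaded on one physical CPU. */
static struct toy_vmcs *cpu_current_vmcs;
/* Counts how often ownership had to be re-established. */
static int reload_count;

/* Make v's VMCS the one loaded on this CPU. */
static void toy_load_vmcs(struct toy_vcpu *v)
{
    cpu_current_vmcs = v->vmcs;
}

/*
 * Models the IPI handler run on behalf of a remote CPU that wants
 * access to the VMCS: afterwards this CPU no longer has it loaded.
 */
static void toy_remote_clear_vmcs(void)
{
    cpu_current_vmcs = NULL;
}

/* Re-establish ownership, but only if it was actually lost. */
static void toy_vmcs_reload(struct toy_vcpu *v)
{
    if (cpu_current_vmcs != v->vmcs) {
        toy_load_vmcs(v);
        reload_count++;
    }
}

/*
 * Models the deferred switch-out path: the VMCS may have been taken
 * away in the meantime, so reload before touching it.
 */
static void toy_ctxt_switch_from(struct toy_vcpu *v)
{
    toy_vmcs_reload(v);
    assert(cpu_current_vmcs == v->vmcs);  /* accesses are safe now */
}
```

In this model, calling toy_load_vmcs(), then toy_remote_clear_vmcs(),
then toy_ctxt_switch_from() reloads the VMCS exactly once; a second
toy_ctxt_switch_from() finds it already loaded and does nothing.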

> Reported-by: Kevin Mayer <Kevin.Mayer@xxxxxxxx>
> Reported-by: Anshul Makkar <anshul.makkar@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> v2: Drop the spin loop from vmx_vmcs_reload(). Use the function in
>     vmx_do_resume() instead of open coding it there (requiring the
>     ASSERT()s to be adjusted/dropped). Drop the new
>     ->ctxt_switch_same() hook.

For the code itself:

Reviewed-by: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>

And since overnight testing of the PML scenario (reboot of 32 VMs)
didn't find any issues:

Tested-by: Sergey Dyasli <sergey.dyasli@xxxxxxxxxx>
