Re: [Xen-devel] PML causing race condition during guest bootstorm and host crash on Broadwell cpu.

>>> On 07.02.17 at 18:26, <anshul.makkar@xxxxxxxxxx> wrote:
> I am facing an issue where a bootstorm of guests leads to a host crash. I
> debugged it and found that enabling PML introduces a race condition during
> the guest teardown stage, between disabling PML on a vcpu and a context
> switch happening for another vcpu.
> The crash happens only on Broadwell processors, as PML was introduced in
> this generation.
> Here is my analysis:
> Race condition:
> vmcs.c: vmx_vcpu_disable_pml(vcpu) { vmx_vmcs_enter(); vm_write(
> disable_PML); vmx_vmcs_exit(); }
> In between, I have a call path from another pcpu executing context
> switch -> vmx_fpu_leave(), which crashes on the vmwrite:
>     if ( !(v->arch.hvm_vmx.host_cr0 & X86_CR0_TS) )
>     {
>         v->arch.hvm_vmx.host_cr0 |= X86_CR0_TS;
>         __vmwrite(HOST_CR0, v->arch.hvm_vmx.host_cr0);  // crash
>     }

So that's after current has changed already, so it's effectively
dealing with a foreign VMCS, but it doesn't use vmx_vmcs_enter().
The locking done in vmx_vmcs_try_enter() / vmx_vmcs_exit(),
however, assumes that any user of a VMCS either owns the lock
or has current as the owner of the VMCS. Of course such a call
also can't be added here, as a vcpu on the context-switch-from
path can't vcpu_pause() itself.

Taking that together with the bypassing of __context_switch()
when the incoming vCPU is the idle one (which means that via
context_saved() ->is_running will be cleared before running
->ctxt_switch_from()), the vcpu_pause() invocation in
vmx_vmcs_try_enter() may not have to wait at all if the call
happens between the clearing of ->is_running and the
eventual invocation of vmx_ctxt_switch_from().
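
The window described above can be sketched with a small, deterministic model. This is not Xen code: the struct, its two flags, and every model_* name are hypothetical stand-ins chosen for illustration, and the model assumes that vcpu_pause() waits only on ->is_running.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the state discussed above; not Xen code. */
struct model_vcpu {
    bool is_running;   /* the flag a remote vcpu_pause() is assumed to wait on */
    bool vmcs_in_use;  /* old pCPU may still __vmwrite() to this vCPU's VMCS */
};

/* Lazy switch to idle: context_saved() clears is_running, but the
 * ctxt_switch_from() hook has not run yet. */
static void model_switch_to_idle(struct model_vcpu *v)
{
    v->is_running = false;
}

/* Deferred state save, standing in for vmx_ctxt_switch_from(). */
static void model_ctxt_switch_from(struct model_vcpu *v)
{
    v->vmcs_in_use = false;
}

/* Would a remote vcpu_pause() return without waiting? */
static bool model_pause_returns_immediately(const struct model_vcpu *v)
{
    return !v->is_running;
}

/* True when a remote VMCS enter could proceed while the old pCPU
 * may still touch the same VMCS. */
static bool model_race_window_open(const struct model_vcpu *v)
{
    return model_pause_returns_immediately(v) && v->vmcs_in_use;
}

/* Walks the three stages and returns a bitmask of the stages in
 * which the window is open (bit 0: running, bit 1: after
 * context_saved(), bit 2: after the deferred save). */
static int model_walk(void)
{
    struct model_vcpu v = { .is_running = true, .vmcs_in_use = true };
    int open = 0;

    if (model_race_window_open(&v))
        open |= 1;
    model_switch_to_idle(&v);
    if (model_race_window_open(&v))
        open |= 2;
    model_ctxt_switch_from(&v);
    if (model_race_window_open(&v))
        open |= 4;
    return open;
}
```

In this model only the middle stage has the window open: the pause no longer blocks, yet the deferred save that issues the __vmwrite() is still pending.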

If the above makes sense (which I'm not at all sure of), the
question is whether using this_cpu(curr_vcpu) instead of
current in the VMCS enter/exit functions would help.
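
To make the difference between the two pointers concrete, here is a minimal sketch under the assumption that a lazy switch to idle updates current but leaves the per-CPU curr_vcpu pointing at the outgoing vCPU. All sk_* names are hypothetical and only model the ownership test; the sketch does not show that such a change would be sufficient to close the race.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model types; not Xen's struct vcpu or per-CPU data. */
struct sk_vcpu {
    int id;
};

struct sk_pcpu {
    struct sk_vcpu *current;    /* scheduler's view of what runs here */
    struct sk_vcpu *curr_vcpu;  /* vCPU whose state is actually loaded */
};

/* Lazy switch to idle: only the scheduler's view changes. */
static void sk_lazy_switch_to_idle(struct sk_pcpu *p, struct sk_vcpu *idle)
{
    p->current = idle;
    /* curr_vcpu deliberately left alone: state is still loaded. */
}

/* Full switch: both views move to the incoming vCPU. */
static void sk_full_switch(struct sk_pcpu *p, struct sk_vcpu *next)
{
    p->current = next;
    p->curr_vcpu = next;
}

/* Ownership test keyed on current. */
static bool sk_owner_by_current(const struct sk_pcpu *p,
                                const struct sk_vcpu *v)
{
    return p->current == v;
}

/* Ownership test keyed on the per-CPU curr_vcpu instead. */
static bool sk_owner_by_curr_vcpu(const struct sk_pcpu *p,
                                  const struct sk_vcpu *v)
{
    return p->curr_vcpu == v;
}
```

After sk_lazy_switch_to_idle(), the current-based test no longer treats the outgoing vCPU as the local owner of its VMCS, while the curr_vcpu-based test still does, which is the distinction the question above turns on.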


