
Re: [Xen-devel] [PATCH] VMX: sync CPU state upon vCPU destruction



On 10/11/17 10:30, Jan Beulich wrote:
>>>> On 10.11.17 at 09:41, <sergey.dyasli@xxxxxxxxxx> wrote:
>> On Thu, 2017-11-09 at 07:49 -0700, Jan Beulich wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -479,7 +479,13 @@ static void vmx_vcpu_destroy(struct vcpu
>>>       * we should disable PML manually here. Note that vmx_vcpu_destroy is called
>>>       * prior to vmx_domain_destroy so we need to disable PML for each vcpu
>>>       * separately here.
>>> +     *
>>> +     * Before doing that though, flush all state for the vCPU previously having
>>> +     * run on the current CPU, so that this flushing of state won't happen from
>>> +     * the TLB flush IPI handler behind the back of a vmx_vmcs_enter() /
>>> +     * vmx_vmcs_exit() section.
>>>       */
>>> +    sync_local_execstate();
>>>      vmx_vcpu_disable_pml(v);
>>>      vmx_destroy_vmcs(v);
>>>      passive_domain_destroy(v);
>>
>> This patch fixes only one particular issue and not the general problem.
>> What if the VMCS is cleared, possibly by some future code, somewhere else?
> 
> As indicated in the earlier discussion, if we go this route, other
> future async accesses may need to do the same then.
> 
>> The original intent of vmx_vmcs_reload() is correct: it lazily loads
>> the VMCS when it's needed. It's just that the logic which checks
>> v->is_running inside vmx_ctxt_switch_from() is flawed: v might be
>> "running" on another pCPU.
>>
>> IMHO there are 2 possible solutions:
>>
>>     1. Add an additional pCPU check to vmx_ctxt_switch_from()
> 
> I agree with Dario in not seeing this as a possible solution.
> 
>>     2. Drop the v->is_running check inside vmx_ctxt_switch_from(),
>>        making the call to vmx_vmcs_reload() unconditional.
> 
> This is an option, indeed (and I don't think it would have a
> meaningful performance impact, as vmx_vmcs_reload() does
> nothing if the right VMCS is already in place). Iirc I had added the
> conditional back then merely to introduce as little of a behavioral
> change as appeared necessary at the time. What I'm
> not certain about, however, is the final state we'll end up in then.
> Coming back to your flow scheme (altered to represent the
> suggested new flow):
> 

I have been thinking about this approach for a while, and I couldn't find
anything dangerous that vmx_vmcs_reload() could potentially do, since it
looks like it already has all the necessary checks inside.
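For reference, the reload itself is roughly of this shape (paraphrased
from memory rather than quoted from the tree, so the vmcs_pa field and
the current_vmcs per-CPU variable names may not match exactly):

    void vmx_vmcs_reload(struct vcpu *v)
    {
        /*
         * If this pCPU already has v's VMCS loaded there is nothing to
         * do; otherwise make it the current VMCS again.
         */
        if ( v->arch.hvm_vmx.vmcs_pa == this_cpu(current_vmcs) )
            return;

        vmx_load_vmcs(v);
    }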

> pCPU1                                   pCPU2
> =====                                   =====
> current == vCPU1
> context_switch(next == idle)
> !! __context_switch() is skipped
> vcpu_migrate(vCPU1)
> RCU callbacks
> vmx_vcpu_destroy()
> vmx_vcpu_disable_pml()
> current_vmcs = 0
> 
>                                         schedule(next == vCPU1)
>                                         vCPU1->is_running = 1;
>                                         context_switch(next == vCPU1)
>                                         flush_tlb_mask(&dirty_mask);
>                                     
>                                 <--- IPI
> 
> __sync_local_execstate()
> __context_switch(prev == vCPU1)
> vmx_ctxt_switch_from(vCPU1)
> vmx_vmcs_reload()
> ...
> 
> We'd now leave the VMCS of the vCPU being destroyed active on pCPU1
> (at least I can't see where it would be deactivated again).

This would be the VMCS of the migrated vCPU, not the destroyed one.
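
For comparison, option 2 from earlier in the thread would amount to
something like this in vmx_ctxt_switch_from() (again just a sketch, with
the surrounding save logic elided, not the exact Xen code):

    static void vmx_ctxt_switch_from(struct vcpu *v)
    {
        /*
         * Keep the early exit for pCPUs that have already torn down
         * VMX, e.g. during shutdown with HVM guests still running.
         */
        if ( unlikely(!this_cpu(vmxon)) )
            return;

        /* Previously guarded by "if ( !v->is_running )". */
        vmx_vmcs_reload(v);

        /* ... save guest MSRs, debug registers, etc. ... */
    }

As noted above, vmx_vmcs_reload() being a no-op when the right VMCS is
already in place keeps the unconditional call cheap.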

Igor

> Overall I think it is quite reasonable to terminate a lazy context
> switch of a vCPU under destruction early. From that abstract
> consideration, forcing this higher up the call stack of
> vmx_vcpu_destroy(), perhaps even right in RCU handling (as I had
> suggested as an alternative previously, before actually moving it
> further down into VMX code), would continue to be an
> option. In this context you may want to pay particular
> attention to the description of 346da00456 ("Synchronise lazy
> execstate before calling tasklet handlers").
> 
> Jan
> 
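
For illustration, the "right in RCU handling" variant Jan mentions would
boil down to flushing the lazy execstate before the RCU callbacks get to
run, in the spirit of what 346da00456 is described as doing for tasklet
handlers. This is purely a sketch, with a hypothetical helper standing in
for wherever Xen actually walks its callback list:

    /* Hypothetical helper, not actual Xen code. */
    static void run_rcu_callbacks(struct rcu_head *list)
    {
        /*
         * Flush any lazily retained vCPU state before callbacks such as
         * vmx_vcpu_destroy() get to run on this pCPU.
         */
        sync_local_execstate();

        while ( list )
        {
            struct rcu_head *next = list->next; /* func() may free 'list' */

            list->func(list);
            list = next;
        }
    }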

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
