[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/vm_event: fix race between vmx_vmexit_handler() and vm_event_resume()



On 04/27/17 11:01, Jan Beulich wrote:
>>>> On 27.04.17 at 09:22, <rcojocaru@xxxxxxxxxxxxxxx> wrote:
>> The introspection agent can reply to a vm_event faster than
>> vmx_vmexit_handler() can complete in some cases, where it is then
>> not safe for vm_event_set_registers() to modify v->arch.user_regs.
> 
> This needs more explanation: I cannot see the connection between
> vmx_vmexit_handler() completing and (un)safety of modification of
> v->arch.user_regs. The latter is unsafe as long as the vCPU hasn't
> gone through __context_switch(), and the former doesn't call that
> function directly or indirectly (i.e. I think it is more than just
> vmx_vmexit_handler() which needs to be completed by the time
> register modification becomes safe to do).

Indeed, this is exactly the case (__context_switch() doesn't go
through). I'll re-word the commit message.

>> This patch ensures that vm_event_resume() code only sets per-VCPU
>> data to be used for the actual setting of registers only in
>> hvm_do_resume() (similar to the model used to control setting of CRs
>> and MSRs).
> 
> I think the second "only" would better be "later".

Yes, it's actually redundant (sorry). I'll change it to "later".

>> The patch additionally removes the sync_vcpu_execstate(v) call from
>> vm_event_resume(), which is no longer necessary, which removes the
>> associated broadcast TLB flush (read: performance improvement).
> 
> Depending on the better explanation above, it may or may not be
> further necessary to also better explain the "no longer necessary"
> part here.
> 
>> --- a/xen/arch/x86/hvm/hvm.c
>> +++ b/xen/arch/x86/hvm/hvm.c
>> @@ -473,6 +473,39 @@ static bool hvm_get_pending_event(struct vcpu *v, 
>> struct x86_event *info)
>>      return hvm_funcs.get_pending_event(v, info);
>>  }
>>  
>> +static void hvm_vm_event_set_registers(struct vcpu *v)
> 
> This could be const, as *v itself isn't being modified, but maybe it's
> better to leave it non-const indeed.

I have no problem changing it if there are no objections.

>> +{
>> +    ASSERT(v == current);
>> +
>> +    if ( v->arch.vm_event->set_gprs )
> 
> I think we will want an unlikely() here. We anyway can only hope for
> the compiler to always inline this function, such that non-VM-event
> setups don't get penalized by the extra call here. Strictly speaking
> the function doesn't belong into this file ...

Should I move it to the x86 vm_event code?

>> +    {
>> +        struct cpu_user_regs *regs = guest_cpu_user_regs();
>> +
>> +        regs->rax = v->arch.vm_event->gprs.rax;
>> +        regs->rbx = v->arch.vm_event->gprs.rbx;
>> +        regs->rcx = v->arch.vm_event->gprs.rcx;
>> +        regs->rdx = v->arch.vm_event->gprs.rdx;
>> +        regs->rsp = v->arch.vm_event->gprs.rsp;
>> +        regs->rbp = v->arch.vm_event->gprs.rbp;
>> +        regs->rsi = v->arch.vm_event->gprs.rsi;
>> +        regs->rdi = v->arch.vm_event->gprs.rdi;
>> +
>> +        regs->r8 = v->arch.vm_event->gprs.r8;
>> +        regs->r9 = v->arch.vm_event->gprs.r9;
>> +        regs->r10 = v->arch.vm_event->gprs.r10;
>> +        regs->r11 = v->arch.vm_event->gprs.r11;
>> +        regs->r12 = v->arch.vm_event->gprs.r12;
>> +        regs->r13 = v->arch.vm_event->gprs.r13;
>> +        regs->r14 = v->arch.vm_event->gprs.r14;
>> +        regs->r15 = v->arch.vm_event->gprs.r15;
>> +
>> +        regs->rflags = v->arch.vm_event->gprs.rflags;
>> +        regs->rip = v->arch.vm_event->gprs.rip;
>> +
>> +        v->arch.vm_event->set_gprs = 0;
> 
> false (and true/bool elsewhere)

I'll change it to bool.

>> --- a/xen/common/vm_event.c
>> +++ b/xen/common/vm_event.c
>> @@ -357,6 +357,13 @@ void vm_event_resume(struct domain *d, struct 
>> vm_event_domain *ved)
>>  {
>>      vm_event_response_t rsp;
>>  
>> +    /*
>> +     * vm_event_resume() runs either from XEN_DOMCTL_VM_EVENT_OP_*, or
>> +     * EVTCHN_send from the introspection consumer.  Both contexts are
>> +     * guaranteed not to be the subject of vm_event responses.
>> +     */
>> +    ASSERT(d != current->domain);
> 
> What is this meant to guard against? It surely isn't ...
> 
>> @@ -375,13 +382,6 @@ void vm_event_resume(struct domain *d, struct 
>> vm_event_domain *ved)
>>          v = d->vcpu[rsp.vcpu_id];
>>  
>>          /*
>> -         * Make sure the vCPU state has been synchronized for the custom
>> -         * handlers.
>> -         */
>> -        if ( atomic_read(&v->vm_event_pause_count) )
>> -            sync_vcpu_execstate(v);
> 
> ... a "replacement" for this, as the state of "current" doesn't reflect
> whether register state has been saved (that's this_cpu(curr_vcpu)
> instead). Nor would a comparison of domains seem to be the right
> thing - a comparison of vcpus ought to suffice (and be less strict,
> allowing for something hypothetical like self-introspection).

This part (the ASSERT and comment) is from Andrew, he can help us here.


Thanks,
Razvan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.