[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/3] VMX: Properly adjuest the status of pi descriptor



>>> On 23.05.16 at 09:16, <feng.wu@xxxxxxxxx> wrote:

> 
>> -----Original Message-----
>> From: Tian, Kevin
>> Sent: Monday, May 23, 2016 2:52 PM
>> To: Wu, Feng <feng.wu@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx 
>> Cc: keir@xxxxxxx; jbeulich@xxxxxxxx; andrew.cooper3@xxxxxxxxxx;
>> george.dunlap@xxxxxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> konrad.wilk@xxxxxxxxxx 
>> Subject: RE: [PATCH 1/3] VMX: Properly adjuest the status of pi descriptor
>> 
>> > From: Wu, Feng
>> > Sent: Monday, May 23, 2016 1:28 PM
>> >
>> >
>> > > -----Original Message-----
>> > > From: Tian, Kevin
>> > > Sent: Monday, May 23, 2016 1:15 PM
>> > > To: Wu, Feng <feng.wu@xxxxxxxxx>; xen-devel@xxxxxxxxxxxxx 
>> > > Cc: keir@xxxxxxx; jbeulich@xxxxxxxx; andrew.cooper3@xxxxxxxxxx;
>> > > george.dunlap@xxxxxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
>> > > konrad.wilk@xxxxxxxxxx 
>> > > Subject: RE: [PATCH 1/3] VMX: Properly adjuest the status of pi 
>> > > descriptor
>> > >
>> > > > From: Wu, Feng
>> > > > Sent: Friday, May 20, 2016 4:54 PM
>> > > >
>> > > > When the last assigned device is dettached from the domain, all
>> > > > the PI related hooks are removed then, however, the vCPU can be
>> > > > blocked, switched to another pCPU, etc, all without the aware of
>> > > > PI. After the next time we attach another device to the domain,
>> > > > which makes the PI realted hooks avaliable again, the status
>> > > > of the pi descriptor is not true, we need to properly adjust
>> > > > it.
>> > >
>> > > Instead of adjusting pi descriptor in multiple places, can we
>> > > simply reset the status (including removing from block list)
>> > > right when hooks are removed at deattach?
>> > >
>> >
>> > I also thought about this idea before, the problem is that when
>> > the hooks are being removed at the pci device's deattachment,
>> > the hooks may be running at the same time, which may cause
>> > problems, such as, after we have removed the vCPU from the
>> > blocking list, vmx_vcpu_block() (which has been running before the
>> > hooks were removed and not finished yet.) adds a vCPU to the same
>> > blocking list, then this vCPU will remain in the list after the device is
>> > deattached from the domain. At the same reason, if we change PI desc
>> > in vmx_pi_hooks_deassign() while it is being used in the hooks, it
>> > may cause problems.
>> >
>> 
>> It's not just about updating PI desc. The race could exist for changing
>> hooks too:
>> 
>> void vmx_pi_hooks_deassign(struct domain *d)
>>      ...
>>      d->arch.hvm_domain.vmx.pi_switch_from = NULL;
>>      ...
>> 
>> static void vmx_ctxt_switch_from(struct vcpu *v)
>>      ...
>>      if ( v->domain->arch.hvm_domain.vmx.pi_switch_from )
>>         v->domain->arch.hvm_domain.vmx.pi_switch_from(v);
>> 
>> If the callback is cleared between above two lines, you'll refer to
>> a NULL pointer in the 2nd line.
>> 
>> I thought there should be some protection mechanism in place for
>> such fundamental race, which might be extended to cover PI desc
>> too. But looks not the case. Would you mind pointing it out if already
>> well handled? :-)
> 
> You are right, that can happen. However, in this case, it will not introduce
> any bad impacts other than the PI desc is remaining the old value.

How that, if you agree that what Kevin described can happen? He's
pointing out a possible NULL deref...

>> Somehow I'm thinking whether we really need such dynamic
>> callback changes in a SMP environment. Is it safer to introduce
>> some per-domain flag to indicate above scenario so each callback
>> can return early with negligible cost so no need to reset them once
>> registered?
> 
> Yes, I agree with you. As you can see, the current issue is kind of because
> of the dynamically assigning/deassigning the pi hooks, and there are
> lots of concern cases by using it. I think it worth try what you mentioned
> above. However, this method was sugguested by Jan before, so I'd like
> to know what is Jan's option about this.

Avoiding the indirect calls would certainly be nice. Apart from
reading the function pointer into a local variable, use of which is
separated from the read by a barrier() (to address the NULL
deref concern), did you consider zapping the pointers in an RCU
callback instead of right when being requested?

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.