
Re: [Xen-devel] [PATCH v15 01/11] multicall: add no preemption ability between two calls



>>> On 18.09.14 at 15:45, <chao.p.peng@xxxxxxxxxxxxxxx> wrote:
> On Wed, Sep 17, 2014 at 10:44:12AM +0100, Jan Beulich wrote:
>> >>> On 17.09.14 at 11:22, <chao.p.peng@xxxxxxxxxxxxxxx> wrote:
>> > On Fri, Sep 12, 2014 at 10:55:43AM +0800, Chao Peng wrote:
>> >> On Wed, Sep 10, 2014 at 12:12:07PM +0100, Andrew Cooper wrote:
>> >> > On 10/09/14 11:25, Jan Beulich wrote:
>> >> > >>>> On 10.09.14 at 12:15, <andrew.cooper3@xxxxxxxxxx> wrote:
>> >> > >> On 10/09/14 11:07, Jan Beulich wrote:
>> >> > >>>>>> On 10.09.14 at 11:43, <andrew.cooper3@xxxxxxxxxx> wrote:
>> >> > >>>> Actually, on further thought, using multicalls like this cannot
>> >> > >>>> possibly be correct from a functional point of view.
>> >> > >>>>
>> >> > >>>> Even with the no-preempt flag between a wrmsr/rdmsr hypercall
>> >> > >>>> pair, there is no guarantee that accesses to remote CPUs' MSRs
>> >> > >>>> won't interleave with a different natural access, clobbering the
>> >> > >>>> result of the wrmsr.
>> >> > >>>>
>> >> > >>>> However this is solved, the wrmsr/rdmsr pair *must* be part of
>> >> > >>>> the same synchronous thread of execution on the appropriate CPU.
>> >> > >>>> You can trust that interrupts won't play with these MSRs, but
>> >> > >>>> you absolutely can't guarantee that IPI/wrmsr/IPI/rdmsr will
>> >> > >>>> work.
>> >> > >>> Not sure I follow, particularly in the context of the
>> >> > >>> white-listing of MSRs permitted here (which ought not to include
>> >> > >>> anything the hypervisor needs control over).
>> >> > >> Consider two dom0 vcpus both using this new multicall mechanism
>> >> > >> to read QoS information for different domains, which end up both
>> >> > >> targeting the same remote CPU.  They will both end up using
>> >> > >> IPI/wrmsr/IPI/rdmsr, which may interleave and clobber the first
>> >> > >> wrmsr.
>> >> > > But that situation doesn't result from the multicall use here - it
>> >> > > would equally be the case for an inherently batchable hypercall.
>> >> > 
>> >> > Indeed - I called out multicall because of the current implementation,
>> >> > but I should have been more clear.
>> >> > 
>> >> > > To deal with
>> >> > > that we'd need a wrmsr-then-rdmsr operation, or move the entire
>> >> > > execution of the batch onto the target CPU. Since the former would
>> >> > > quickly become unwieldy for more complex operations, I think this
>> >> > > gets us back to aiming at using continue_hypercall_on_cpu() here.
>> >> > 
>> >> > Which gets us back to the problem that you cannot use
>> >> > copy_{to,from}_guest() after continue_hypercall_on_cpu(), due to being
>> >> > in the wrong context.
>> >> > 
>> >> > 
>> >> > I think this requires a step back and rethink.  I can't offhand think of
>> >> > any combination of existing bits of infrastructure which will allow this
>> >> > to work correctly, which means something new needs designing.
>> >> > 
>> >> How about this:
>> >> 
>> >> 1)  Still do the batch in do_platform_op() but add an iteration field
>> >> to the interface structure.
>> >> 
>> >> 2)  Still use on_selected_cpus() but group adjacent resource_ops that
>> >> target the same cpu and have NO_PREEMPT set into one unit, executing
>> >> it as a whole in the new CPU context.
>> >> 
>> > Any suggestion for this?
>> 
>> 1 is ugly (contradicting everything we do elsewhere), but would be a
>> last resort option.
>> 
>> 2 would perhaps be an option if small, non-preemptible batches
>> were handled in do_platform_op() while larger, preemptible groups
>> then used the multicall interface.
>> 
>> Option 3 would be to fiddle with the current vCPU's affinity before
>> invoking a continuation (perhaps already on the first iteration to
>> get onto the needed pCPU).
>> 
> Thanks Jan.
> 
> On further thought, I think we may be over-designing this.
> 
> Why not keep it simple and also scalable?
> The answer is equally simple: do_platform_op() is always non-preemptible.
> 
> It can accept one operation or a small batch of operations, but it
> guarantees that all of them are non-preemptible (e.g. it never calls
> hypercall_create_continuation()).
> It's the minimum unit of non-preemptible operation.
> 
> If the caller (a userspace tool) wants to make preemptible batch calls,
> the multicall mechanism can be employed.
> We don't need to add a NO_PREEMPT ability to multicalls; just keep them
> preemptible.
> 
> This is almost option 2 above.

Right, this is what I described for option 2 above.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
