
RE: Live migration and PV device handling



> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Dongli Zhang
> Sent: 03 April 2020 23:33
> To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>; Anastassios Nanos <anastassios.nanos@xxxxxxxxxxx>; xen-devel@xxxxxxxxxxxxx
> Subject: Re: Live migration and PV device handling
> 
> Hi Andrew,
> 
> On 4/3/20 5:42 AM, Andrew Cooper wrote:
> > On 03/04/2020 13:32, Anastassios Nanos wrote:
> >> Hi all,
> >>
> >> I am trying to understand how live-migration happens in xen. I am
> >> looking in the HVM guest case and I have dug into the relevant parts
> >> of the toolstack and the hypervisor regarding memory, vCPU context
> >> etc.
> >>
> >> In particular, I am interested in how PV device migration happens. I
> >> assume that the guest is not aware of any suspend/resume operations
> >> being done
> >
> > Sadly, this assumption is not correct.  HVM guests with PV drivers
> > currently have to be aware in exactly the same way as PV guests.
> >
> > Work is in progress to try and address this.  See
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=775a02452ddf3a6889690de90b1a94eb29c3c732
> > (sorry - for some reason that doc isn't being rendered properly in
> > https://xenbits.xen.org/docs/ )
> >
> 
> I read below from the commit:
> 
> +* The toolstack choose a randomized domid for initial creation or default
> +migration, but preserve the source domid non-cooperative migration.
> +Non-Cooperative migration will have to be denied if the domid is
> +unavailable on the target host, but randomization of domid on creation
> +should hopefully minimize the likelihood of this. Non-Cooperative migration
> +to localhost will clearly not be possible.
> 
> Does that indicate that, while the scope of domid_t is a single server in the
> old design, the scope of domid_t is a cluster of servers in the new design?
> 
> That is, must the domid be unique across all servers in the cluster if we
> expect non-cooperative migration to always succeed?
> 

That would be necessary to guarantee success (or rather, to guarantee no
failure due to a domid clash), but the scope of xl/libxl is a single server,
hence randomization is the best we have to reduce clashes to a minimum.

  Paul
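
To illustrate the point about randomization, here is a minimal sketch of
the idea, not the actual libxl code. It assumes guest domids span
1..0x7FEF (domid 0 being dom0, with IDs from 0x7FF0 upward reserved) and
that each host allocates domids uniformly at random. The helper names
are hypothetical.

```python
import random

# Assumption: usable guest domids are 1..0x7FEF (0 is dom0; 0x7FF0 and
# above are reserved).
DOMID_FIRST = 1
DOMID_LAST = 0x7FEF
DOMID_COUNT = DOMID_LAST - DOMID_FIRST + 1


def pick_random_domid(in_use, rng=random):
    """Hypothetical helper: choose a random domid not already in use on this host."""
    used = {d for d in in_use if DOMID_FIRST <= d <= DOMID_LAST}
    if len(used) >= DOMID_COUNT:
        raise RuntimeError("no free domid on this host")
    while True:
        candidate = rng.randint(DOMID_FIRST, DOMID_LAST)
        if candidate not in used:
            return candidate


def can_preserve_domid(source_domid, target_in_use):
    """Non-cooperative migration needs the source domid to be free on the target."""
    return source_domid not in target_in_use


def clash_probability(target_domain_count):
    """Chance that a given source domid is taken on a target host running
    `target_domain_count` guests, assuming uniformly random domids."""
    return min(target_domain_count, DOMID_COUNT) / DOMID_COUNT
```

Under these assumptions a target running 100 guests gives
clash_probability(100) of roughly 0.3%, which is small but not zero.
Only a cluster-wide allocator could rule clashes out entirely, and that
is outside the single-server scope of xl/libxl.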




 

