
Re: [Xen-devel] RFC: QEMU bumping memory limit and domain restore



On Wed, Jun 03, 2015 at 02:22:25PM +0100, George Dunlap wrote:
> On Tue, Jun 2, 2015 at 3:05 PM, Wei Liu <wei.liu2@xxxxxxxxxx> wrote:
> > Previous discussion at [0].
> >
> > For the benefit of discussion, we refer to max_memkb inside the hypervisor
> > as hv_max_memkb (name subject to improvement). That's the maximum amount
> > of memory a domain can use.
> 
> Why don't we try to use "memory" for virtual RAM that we report to the
> guest, and "pages" for what exists inside the hypervisor?  "Pages" is
> the term the hypervisor itself uses internally (i.e., set_max_mem()
> actually changes a domain's max_pages value).
> 
> So in this case both guest memory and option roms are created using
> hypervisor pages.
> 
> > Libxl doesn't know what hv_max_memkb a domain needs prior to QEMU start-up
> > because of option ROMs etc.
> 
> So a translation of this using "memory/pages" terminology would be:
> 
> QEMU may need extra pages from Xen to implement option ROMS, and so at
> the moment it calls set_max_mem() to increase max_pages so that it can
> allocate more pages to the guest.  libxl doesn't know what max_pages a
> domain needs prior to qemu start-up.
> 
> > Libxl doesn't know the hv_max_memkb even after QEMU start-up, because
> > there is no mechanism to communicate between QEMU and libxl. This is an
> > area that needs improvement; we've encountered problems in this area
> > before.
> 
> [translating]
> Libxl doesn't know max_pages even after qemu start-up, because there
> is no mechanism to communicate between qemu and libxl.
> 
> > QEMU calls xc_domain_setmaxmem to increase hv_max_memkb by N pages. Those
> > pages are only accounted for in the hypervisor. During migration, libxl
> > (currently) doesn't extract that value from the hypervisor.
> 
> [translating]
> qemu calls xc_domain_setmaxmem to increase max_pages by N pages.
> Those pages are only accounted for in the hypervisor.  libxl
> (currently) does not extract that value from the hypervisor.
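(For reference, the bump boils down to a single libxc call. The sketch below
is not the actual QEMU code -- the helper name and the way N is computed are
made up -- but it shows the shape of the operation: read the current limit
and raise it by enough KiB to cover the option ROM pages.)

    /* Sketch only -- not the actual QEMU code.  Raises the domain's
     * max_pages limit in the hypervisor so that the extra pages backing
     * option ROMs can be populated afterwards.  nr_rom_pages stands for
     * the "N" discussed above.  info.max_memkb reflects the hypervisor's
     * max_pages, expressed in KiB. */
    #include <stdint.h>
    #include <xenctrl.h>

    static int bump_for_option_roms(xc_interface *xch, uint32_t domid,
                                    unsigned long nr_rom_pages)
    {
        xc_dominfo_t info;

        if (xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
            info.domid != domid)
            return -1;

        /* max_memkb is in KiB; one page is 4 KiB on x86. */
        return xc_domain_setmaxmem(xch, domid,
                                   info.max_memkb + nr_rom_pages * 4);
    }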
> 
> > So now the problem is on the remote end:
> >
> > 1. Libxl indicates domain needs X pages.
> > 2. Domain actually needs X + N pages.
> > 3. Remote end tries to write N more pages and fails.
> >
> > This behaviour currently doesn't affect normal migration (where you
> > transfer the libxl JSON to the remote end, construct a domain, then start
> > QEMU) because QEMU won't bump hv_max_memkb again. This is by design and
> > reflected in QEMU code.
> 
> I don't understand this paragraph -- does the remote domain actually
> need X+N pages or not?  If it does, in what way does this behavior
> "not affect normal migration"?
> 

I was wrong. I don't recollect how I came to that conclusion. It does
affect normal migration.
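(Concretely: the libxl JSON sent to the remote end is built from the domain
configuration and only accounts for X pages, while the domain actually
occupies X + N pages after QEMU's bump, so restoring those last N pages
fails.)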

> > This behaviour affects COLO and becomes a bug in that case, because
> > secondary VM's QEMU doesn't go through the same start-of-day
> > initialisation (Hongyang, correct me if I'm wrong), i.e. no bumping
> > hv_max_memkb inside QEMU.
> >
> > Andrew plans to embed JSON inside migration v2 and COLO is based on
> > migration v2. The bug is fixed if the JSON is correct in the first place.
> >
> > As COLO is not yet upstream, this bug is not a blocker for 4.6. But
> > it should be fixed for the benefit of COLO.
> >
> > So here is a proof of concept patch to record and honour that value
> > during migration.  A new field is added in the IDL. Note that we don't
> > provide an xl-level config option for it and mandate that it be the
> > default value during domain creation. This is to prevent libxl users
> > from using it, so as to avoid unforeseen repercussions.
> >
> > This patch is compile-tested only. If we agree this is the way to go I
> > will test and submit a proper patch.
> 
> Reading max_pages from Xen and setting it on the far side seems like a
> reasonable option.

Yes. That's the correct behaviour we want to have. The question is where
we should put that value and when to set it.

> Is there a reason we can't add a magic XC_SAVE_ID
> for v1, like we do for other parameters?

The main objection is that we shouldn't call xc_domain_setmaxmem in the
middle of a migration stream.
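To illustrate the direction we seem to agree on (the helper names below are
made up, and whether the value should travel in the libxl JSON or in a
migration record is exactly the open question): read the limit the
hypervisor actually enforces on the source, and apply it on the destination
before any guest pages are restored, instead of calling xc_domain_setmaxmem
mid-stream.

    /* Sketch only -- helper names are made up. */
    #include <stdint.h>
    #include <xenctrl.h>

    /* Source side: query the limit the hypervisor enforces.  max_memkb
     * reflects the hypervisor's max_pages, expressed in KiB. */
    static int read_max_memkb(xc_interface *xch, uint32_t domid,
                              uint64_t *max_memkb)
    {
        xc_dominfo_t info;

        if (xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
            info.domid != domid)
            return -1;
        *max_memkb = info.max_memkb;
        return 0;
    }

    /* Destination side: apply it before populating guest memory. */
    static int apply_max_memkb(xc_interface *xch, uint32_t domid,
                               uint64_t max_memkb)
    {
        return xc_domain_setmaxmem(xch, domid, max_memkb);
    }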

Wei.

>  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

