Re: [Xen-devel] [PATCH] Revert "xen-hvm: increase maxmem before calling xc_domain_populate_physmap"



On Tue, 2015-06-16 at 16:56 +0100, George Dunlap wrote:
> On 06/16/2015 04:51 PM, Stefano Stabellini wrote:
> > On Tue, 16 Jun 2015, Wei Liu wrote:
> >> On Wed, Jun 10, 2015 at 01:55:13PM +0100, George Dunlap wrote:
> >>> This reverts commit c1d322e6048796296555dd36fdd102d7fa2f50bf.
> >>>
> >>> The original commit fixes a bug when assigning a large number of
> >>> devices which require option roms to a guest.  (One known
> >>> configuration that needs extra memory is having more than 4 emulated
> >>> NICs assigned.  Three or fewer NICs seem to work without this
> >>> functionality.)
> >>>
> >>> However, by unilaterally increasing maxmem, it introduces two
> >>> problems.
> >>>
> >>> First, now libxl's calculation of the required maxmem during migration
> >>> is broken -- any guest which exercised this functionality will fail on
> >>> migration.  (Guests which have the default number of devices are not
> >>> affected.)
> >>>
> >>> Secondly, it makes it impossible for a higher-level toolstack or
> >>> administrator to predict how much memory a VM will actually use, making
> >>> it much more difficult to effectively use all of the memory on a
> >>> machine.
> >>>
> >>> The right solution to the original problem is to figure out a way for
> >>> qemu to take pages from the existing pool of guest memory, rather than
> >>> allocating more pages.
> >>>
> >>> That fix will take more time to develop than we have until the feature
> >>> freeze.  In the meantime, the simplest way to fix the migration issue
> >>> is to revert this change.  That will re-introduce the original bug,
> >>> but it's an unusual corner case; and without migration it isn't fully
> >>> functional yet anyway.
> >>>
> >>> Signed-off-by: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> >>> ---
> >>> I do think this is the right approach, but I'm sending this mainly
> >>> to open up discussion.
> >>>
> >>> CC: Stefano Stabellini <stefano.stabellini@xxxxxxxxxx>
> >>> CC: Wei Liu <wei.liu2@xxxxxxxxxx>
> >>> CC: Ian Campbell <ian.campbell@xxxxxxxxxx>
> >>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >>
> >> Stefano, Andrew, any comments?
> >>
> >> If we're to do this we need to do it now.
> >>
> >> I think reverting this change in QEMU and relevant changes in libxl
> >> would be the most viable solution to solve this for this release.
> > 
> > Reverting this patch doesn't really solve the problem: instead of
> > breaking on migration when the VM has more than 3 emulated NICs, the VM
> > simply refuses to start in that case. I guess it can be considered a
> > small improvement but certainly not a fix.
> > 
> > Given that the migration issue only happens in an "unusual corner case",
> > are we really in a hurry to revert this commit and go back to the
> > failure to start, even before we actually figure out what the proper fix
> > is?
> 
> I'm in a hurry to go back to a world where qemu doesn't unexpectedly
> allocate more RAM to a guest.  If the real problem with the original
> patch was that it broke migration, we could fix that pretty easily; but
> the real problem (to me) with the original patch is that it
> unpredictably allocates more memory to a guest that the toolstack
> doesn't know about.

Not only that but in trying to deal with this at least one race/bug has
been added in the libxl code. I suspect there are more.

Ian.

