[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] QEMU bumping memory bug analysis



On 06/08/15 11:37, George Dunlap wrote:
> On 06/08/2015 04:01 PM, Don Slutz wrote:
>> On 06/08/15 10:20, George Dunlap wrote:
>>> And at the moment, pages in the p2m are allocated by a number of entities:
>>> * In the libxc domain builder.
>>> * In the guest balloon driver
>>> * And now, in qemu, to allocate extra memory for virtual ROMs.
>>
>> This is not correct.  QEMU and hvmloader both allocate pages for their
>> use.  LIBXL_MAXMEM_CONSTANT allows QEMU and hvmloader to allocate some
>> pages.  The QEMU change only comes into play after LIBXL_MAXMEM_CONSTANT
>> has been reached.
> 
> Thanks -- so the correct statement here is (in time order):
> 
> Pages in the p2m are allocated by a number of entities:
> * In the libxc domain builder
> * In qemu
> * In hvmloader
> * In the guest balloon driver
> 

That is my understanding.  As Ian C pointed out there is a file:

docs/misc/libxl_memory.txt

That attempts to talk about this.

>>> For the first two, it's libxl that sets maxmem, based in its calculation
>>> of the size of virtual RAM plus various other bits that will be needed.
>>>  Having qemu *also* set maxmem was always the wrong thing to do, IMHO.
>>>
>>
>> It does it for all 3 (4?) because it adds LIBXL_MAXMEM_CONSTANT.
> 
> So the correct statement is:
> 
> In the past, libxl has set maxmem for all of those, based on its
> calculation of virtual RAM plus various other bits that might be needed
> (including pages needed by qemu or hvmloader).
> 
> The change as of qemu $WHATEVER is that now qemu also sets it if it
> finds that libxl didn't give it enough "slack".  That was always the
> wrong thing to do, IMHO.
> 

Ok.

>>> In theory, from the interface perspective, what libxl promises to
>>> provide is virtual RAM.  When you say "memory=8192" in a domain config,
>>> that means (or should mean) 8192MiB of virtual RAM, exclusive of video
>>> RAM, virtual ROMs, and magic pages.  Then when you say "xl mem-set
>>> 4096", it should again be aiming at giving the VM the equivalent of
>>> 4096MiB of virtual RAM, exclusive of video RAM, &c &c.
>>
>>
>> Not what is currently done.  virtual video RAM is subtracted from "memory=".
> 
> Right.
> 
> After I sent this, it occurred to me that there were two sensible
> interpretations of "memory=".  The first is, "This is how much virtual
> RAM to give the guest.  Please allocate non-RAM pages in addition to
> this."  The second is, "This is the total amount of host RAM I want the
> guest to use.  Please take non-RAM pages from this total amount."
> 
> In reality we apparently do neither of these. :-)
> 
> I think both break the "principle of least surprise" in different ways,
> but I suspect that admins on the whole would rather have the second
> interpretation, as I think it makes their lives a bit easier.
> 

Before I knew as much about this as I currently do, I had assumed that
second interpretation was what libxl did.  Normally video RAM is the
largest amount and so the smaller delta (LIBXL_MAXMEM_CONSTANT 1MiB
and LIBXL_HVM_EXTRA_MEMORY 2MiB) just was not noticed.

There is also shadow memory, which needs to be in the above.

>>> We already have the problem that the balloon driver at the moment
>>> doesn't actually know how big the guest RAM is, nor , but is being told
>>> to make a balloon exactly big enough to bring the total RAM down to a
>>> specific target.
>>>
>>> I think we do need to have some place in the middle that actually knows
>>> how much memory is actually needed for the different sub-systems, so it
>>> can calculate and set maxmem appropriately.  libxl is the obvious place.
>>
>> Maybe.  So you want libxl to know the detail of balloon overhead?  How
>> about the different sizes of all possible Option ROMs in all QEMU
>> version?  What about hvmloader usage of memory?
> 
> I'm not sure what you mean by "balloon overhead", but if you mean "guest
> pages wasted keeping track of pages which have been ballooned out", then
> no, that's not what I mean.  Neither libxl nor the balloon driver keep
> track of that at the moment.
> 

I was trying to refer to:

NOTE: Because of the way ballooning works, the guest has to allocate
memory to keep track of maxmem pages, regardless of how much memory it
actually has available to it.  A guest with maxmem=262144 and
memory=8096 will report significantly less memory available for use than
a system with maxmem=8096 memory=8096 due to the memory overhead of
having to track the unused pages.

(from xl.cfg man page).

> I think that qemu needs to tell libxl how much memory it is using for
> all of its needs -- including option ROMs.  (See my example below.)  For
> older qemus we can just make some assumptions like we always have.
> 

I am happy with this.  Note: I think libxl could determine this number
now without QEMU changes.  However it does depend on no other thread
changing a "staring" domain's memory while libxl is calculating this.

> I do think it would make sense to have the hvmloader amount listed
> somewhere explicitly.  I'm not sure how often hvmloader may need to
> change the amount it uses for itself.
> 

hvmloader does yet a different method.  If
xc_domain_populate_physmap_exact() fails, it reduces guest RAM (if my
memory is correct).

>>> What about this:
>>> * Libxl has a maximum amount of RAM that qemu is *allowed* to use to set
>>> up virtual ROMs, video ram for virtual devices, &c
>>> * At start-of-day, it sets maxpages to PAGES(virtual RAM)+PAGES(magic) +
>>> max_qemu_pages
>>> * Qemu allocates as many pages as it needs for option ROMS, and writes
>>> the amount that it actually did use into a special node in xenstore.
>>> * When the domain is unpaused, libxl will set maxpages to PAGES(virtual
>>> RAM) + PAGES(magic) + actual_qemu_pages that it gets from xenstore.
>>>
>>
>> I think this does match What Wei Liu said:
> 
> The suggestion you quote below is that the *user* should have to put in
> some number in the config file, not that qemu should write the number
> into xenstore.
> 
> The key distinction of my suggestion was to set maxpages purposely high,
> wait for qemu to use what it needs, then to reduce it down to what was
> needed.
> 

Sorry, I did not get that.

   -Don Slutz

>  -George
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.