
Re: [Xen-devel] Xen balloon driver improvement (version 1)



On Thu, Oct 23, 2014 at 11:09:19AM +0100, David Vrabel wrote:
> On 22/10/14 17:29, Wei Liu wrote:
> > 
> > ### Make use of balloon page compaction
> [...]
> > The core of the migration callback is the XENMEM_exchange hypercall.
> > This makes sure that inflation of the old page and deflation of the
> > new page are done atomically, so even if a domain is beyond its
> > memory target and being enforced, it can still compact memory.
> 
> XENMEM_exchange doesn't really have the behaviour that is needed here.
> 

Atomicity is guaranteed, isn't it?
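
To spell out what I mean: in the migration callback the driver would
issue one XENMEM_exchange call, and Xen frees the "in" extent and
allocates the "out" extent within that single hypercall, so the domain
never needs headroom above its current allocation. A rough sketch (the
helper name is made up; error handling and the p2m/m2p updates that
follow are omitted):

#include <linux/errno.h>
#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>

/* Trade the frame backing one 4K page for a freshly allocated one.
 * For a PV guest the new machine frame is written back into *new_frame;
 * wiring it up in the p2m/m2p is left to the caller. */
static int exchange_single_page(xen_pfn_t old_frame, xen_pfn_t *new_frame)
{
        struct xen_memory_exchange exchange = {
                .in = {
                        .nr_extents   = 1,
                        .extent_order = 0,
                        .domid        = DOMID_SELF,
                },
                .out = {
                        .nr_extents   = 1,
                        .extent_order = 0,
                        .domid        = DOMID_SELF,
                },
        };
        int rc;

        set_xen_guest_handle(exchange.in.extent_start, &old_frame);
        set_xen_guest_handle(exchange.out.extent_start, new_frame);

        /* Free the input frame and allocate the output frame in one go. */
        rc = HYPERVISOR_memory_op(XENMEM_exchange, &exchange);
        if (rc == 0 && exchange.nr_exchanged != 1)
                rc = -ENOMEM;
        return rc;
}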

> Page migration splits the memory map into two parts, the populated area
> at the bottom and the balloon area.  The populated area is fragmented by
> ballooned pages, and the balloon area is fragmented by populated pages.
> 
> Consider a single ballooned page in the middle of an otherwise intact
> superframe.  Page migration wants to populate this page and depopulate a
> different page from the balloon area.
> 
> A hypercall that can do an atomic populate and depopulate will allow xen
> to easily recreate the superframe (if the missing frame is free).
> XENMEM_exchange will leave the superframe fragmented.
> 

It's true that the host superframe is left fragmented, but how is that
worse than before? Balloon page compaction is meant to defragment the
guest address space. I think it's acceptable as long as it doesn't make
host frame fragmentation worse.

> XENMEM_exchange would be an acceptable fallback when this new hypercall
> is not available.
> 

What I'm trying to do here is build a cycle of balloon compaction and
page coalescence that converges on both the host and the guest
defragmenting their address spaces.

Adding a new hypercall is orthogonal to this approach. It might be more
efficient, but it also means that, to use it, guests are tied to a new
hypervisor.

Furthermore, we can probably consider changing XENMEM_exchange to
achieve the functionality you need under the hood, without guest
intervention.
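
To make the coalescence half of that cycle concrete, the guest can use
the same hypercall with different extent orders to trade 512 scattered
4K frames for one machine-contiguous 2M extent. A rough sketch along
the same lines (helper name made up; error handling and the follow-up
p2m/m2p updates are omitted):

#include <linux/errno.h>
#include <xen/interface/memory.h>
#include <asm/xen/hypercall.h>

#define HUGE_ORDER 9                    /* 2M / 4K = 512 pages */

static int exchange_for_superframe(xen_pfn_t *in_frames /* [512] */,
                                   xen_pfn_t *out_frame)
{
        struct xen_memory_exchange exchange = {
                .in = {
                        .nr_extents   = 1UL << HUGE_ORDER,
                        .extent_order = 0,
                        .domid        = DOMID_SELF,
                },
                .out = {
                        .nr_extents   = 1,
                        .extent_order = HUGE_ORDER,   /* one 2M extent */
                        .domid        = DOMID_SELF,
                },
        };
        int rc;

        set_xen_guest_handle(exchange.in.extent_start, in_frames);
        set_xen_guest_handle(exchange.out.extent_start, out_frame);

        /* 512 input frames freed, one contiguous 2M machine extent
         * allocated, all within the same hypercall. */
        rc = HYPERVISOR_memory_op(XENMEM_exchange, &exchange);
        if (rc == 0 && exchange.nr_exchanged != (1UL << HUGE_ORDER))
                rc = -ENOMEM;
        return rc;
}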

> > ### Maintain multiple queues for pages of different sizes and purposes
> > 
> > We maintain multiple queues for pages of different sizes inside Xen
> > balloon driver, so that Xen balloon worker thread can coalesce smaller
> > size pages into one larger size page. Queues for special-purpose
> > pages, such as balloon pages used to map foreign pages, are also
> > maintained. These special-purpose pages are not subject to migration
> > and page coalescence.
> > 
> > For instance, the balloon driver can maintain three queues:
> > 
> > 1. queue for 2 MB pages
> > 1. queue for 4 KB pages (delegated to core balloon driver)
> > 1. queue for pages used to map pages from other domains
> > 
> > More queues can be added when necessary, but for now one queue for
> > normal pages and one queue for huge pages should be enough.
> 
> Can you explain why this is specific to Xen and why other hypervisors
> wouldn't want to make use of all this huge page infrastructure?
> 

Linux as a hypervisor can use the huge page infrastructure and page
migration.

I think you're talking about balloon page compaction in the guest? As a
guest, Linux uses balloon compaction. However, the host is capable of
doing page migration all by itself and, if configured, uses THP to back
the guest address space, so the balloon driver in the guest carries
less of a burden: it doesn't have to actively ask the hypervisor to
back its address space with huge pages. Xen is less capable in this
area.

> > ### Worker thread to coalesce small size pages
> > 
> > The worker thread wakes up periodically to check whether there are
> > enough pages in the normal size page queue to coalesce into a huge
> > page. If so, it will try to exchange that huge page into a number of
> > normal size pages with the XENMEM_exchange hypercall.
> 
> I don't think you need a new worker thread for this; the existing page
> migration is already trying to keep the ballooned zone contiguous, so
> after migrating pages you need only try to move contiguous ballooned 4k
> pages to the 2M list.
> 

That's an idea. I will take this into consideration.
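
For the record, the check itself would look something like the sketch
below. The bitmap (one bit per ballooned-out 4K frame) is purely
hypothetical, just to keep the example short; the real driver keeps its
ballooned pages on a list of struct page.

#include <linux/bitmap.h>
#include <linux/types.h>

#define HUGE_NR (1UL << 9)              /* 512 x 4K = 2M */

/* Hypothetical state: one bit set per ballooned-out guest frame. */
extern unsigned long *balloon_bitmap;
extern unsigned long balloon_nr_frames;

/* After the compaction path migrates a page, look for a 2M-aligned
 * block whose 512 frames are all ballooned out; such a block can be
 * promoted from the 4K queue to the 2M queue. */
static bool find_promotable_block(unsigned long *start_pfn)
{
        unsigned long pfn;

        for (pfn = 0; pfn + HUGE_NR <= balloon_nr_frames; pfn += HUGE_NR) {
                if (bitmap_weight(balloon_bitmap + pfn / BITS_PER_LONG,
                                  HUGE_NR) == HUGE_NR) {
                        *start_pfn = pfn;
                        return true;
                }
        }
        return false;
}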

> > ## Flowcharts
> > 
> > These flowcharts assume normal page size is 4K and huge page size is
> > 2M.  They show how two queues are maintained.
> 
> Having to break 2M pages into 4k ones to meet a target suggests that the
> toolstack should allocate a domain with 2M multiples and should set the
> target in 2M multiples only.  The autoballoon driver will also need to
> set the target in 2M multiples.
> 

That only happens when the amount is a multiple of 2M. Otherwise it
just works as before, using 4K pages.
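
In other words, something along these lines (the two inflate helpers
are hypothetical and stubbed out just for illustration):

#define PAGES_PER_2M (1UL << 9)         /* 512 x 4K pages */

/* Hypothetical per-queue helpers; the 4K one is what exists today. */
static void inflate_huge_queue(unsigned long nr_2m)  { /* ... */ }
static void inflate_small_queue(unsigned long nr_4k) { /* ... */ }

/* Only use the 2M queue when the requested amount is a whole number
 * of 2M pages; any other amount goes through the existing 4K path. */
static void adjust_balloon(unsigned long delta_4k_pages)
{
        if (delta_4k_pages % PAGES_PER_2M == 0)
                inflate_huge_queue(delta_4k_pages / PAGES_PER_2M);
        else
                inflate_small_queue(delta_4k_pages);
}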

Wei.

> David
