
[Xen-devel] Analysis of using balloon page compaction in Xen balloon driver

This document analyses the impact of using balloon compaction
infrastructure in Xen balloon driver.

## Motives

1. Balloon pages fragment the guest physical address space.
2. The balloon compaction infrastructure can migrate ballooned pages from
   the start of a zone to the end of the zone, creating contiguous guest
   physical address space.
3. Having contiguous guest physical addresses enables further options to
   improve performance.

## Benefit for auto-translated guest

HVM/PVH/ARM guests can have contiguous guest physical address space
after balloon pages are compacted, which potentially improves memory
performance provided the guest makes use of huge pages, either via
hugetlbfs or Transparent Huge Pages (THP).

Consider the memory access pattern of these guests: one access to a
guest physical address involves several accesses to machine memory.
The total number of memory accesses can be represented as:

> X = H1 * G1 + H2 * G2 + ... + Hn * Gn + 1

where Hx denotes the number of second stage page table walk levels and
Gx denotes the number of guest page table walk levels.

By having contiguous guest physical addresses, the guest can make use
of huge pages. This reduces the number of G terms in the formula.
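As a hedged illustration of the formula (one possible reading, with
concrete level counts that are assumptions, not from the original: a
4-level guest walk where each guest level pays a full 4-level second
stage walk):

```python
# Toy model of the access-count formula X = H1*G1 + ... + Hn*Gn + 1.
# Each (h, g) pair is one guest page table level: h second-stage
# accesses weighted by g guest-level accesses; the final +1 is the
# access to the data itself. Numbers are illustrative only.
def total_accesses(levels):
    return sum(h * g for h, g in levels) + 1

# Assumed: 4-level guest walk, each level needing a 4-level
# second-stage walk.
four_level = [(4, 1)] * 4
print(total_accesses(four_level))   # 17 machine accesses

# A 2 MB guest huge page removes one guest walk level.
with_huge_page = [(4, 1)] * 3
print(total_accesses(with_huge_page))  # 13 machine accesses
```

Under this toy reading, dropping one guest level via a huge page cuts
the walk cost noticeably, which is the effect the formula is meant to
capture.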

Reducing the number of H terms is a separate hypervisor-side project
and should be decoupled from the Linux-side changes.

## Design and implementation

The use of balloon compaction doesn't require introducing new
interfaces between the Xen balloon driver and the rest of the system.
Most changes are internal to the Xen balloon driver.

Currently, the Xen balloon driver gets its pages directly from the
page allocator. To enable balloon page migration, those pages now need
to be allocated from the core balloon driver. Pages allocated from the
core balloon driver are subject to balloon page compaction.

The Xen balloon driver will also need to provide a callback to migrate
balloon pages. In essence, the callback function receives an "old
page", which is an already ballooned out page, and a "new page", which
is a page to be ballooned out; it then inflates the "old page" and
deflates the "new page".

The core of the migration callback is the XENMEM\_exchange hypercall.
This makes sure that inflation of the old page and deflation of the
new page are done atomically, so even if a domain is beyond its memory
target and the target is being enforced, it can still compact memory.

## HAP table fragmentation is not made worse

*Assumption*: guest physical address space is already heavily
fragmented by balloon pages when balloon page compaction is required.

For a typical test case like ballooning up and down while doing a
kernel compilation, there are usually only a handful of huge pages
left in the end. So the observation matches the assumption. On the
other hand, if the guest physical address space is not heavily
fragmented, balloon page compaction is unlikely to be triggered
automatically.

In practice, balloon page compaction is not likely to make things
worse. Here is the analysis based on the above assumption.

Note that the HAP table is already shattered by balloon pages. When a
guest page is ballooned out, the underlying HAP entry needs to be
split if that entry pointed to a huge page.
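To make the shattering concrete, here is a small toy model (the data
structures and sizes are illustrative only, not actual Xen code):
ballooning out one 4K page that sits inside a 2 MB HAP superpage
forces that superpage entry to be split into 512 individual 4K
entries.

```python
# Toy HAP model: each entry maps a base gfn to (mfn, page_count).
# Illustrative structures only, not actual Xen internals.
INVALID_MFN = -1
PAGES_PER_2M = 512  # 2 MB / 4 KB

def balloon_out(hap, gfn):
    """Balloon out one 4K gfn, splitting a superpage entry if needed."""
    base, (mfn, count) = next(
        (b, e) for b, e in hap.items() if b <= gfn < b + e[1])
    if count > 1:                  # superpage: shatter into 4K entries
        del hap[base]
        for i in range(count):
            hap[base + i] = (mfn + i, 1)
    hap[gfn] = (INVALID_MFN, 1)    # the ballooned page maps nothing

hap = {0: (0x1000, PAGES_PER_2M)}  # one 2 MB superpage entry
balloon_out(hap, 5)
print(len(hap))                    # one entry became 512: shattered
```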

XENMEM\_exchange works as follows, where "old page" is the guest page
about to get inflated and "new page" is the guest page about to get
deflated:

1. Steal the old page from the domain.
2. Allocate a heap page from domheap.
3. Release the new page back to Xen.
4. Update the guest physmap: the old page points to the heap page, the
   new page points to INVALID\_MFN.
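The steps above can be sketched with a toy physmap model (purely
illustrative names and structures, not actual Xen internals or
hypercall semantics beyond the listed steps):

```python
# Toy model of the XENMEM_exchange steps. A physmap maps gfn -> mfn,
# with INVALID_MFN marking a ballooned-out (unbacked) guest frame.
INVALID_MFN = -1

def xenmem_exchange(physmap, domheap_free, old_gfn, new_gfn):
    """old_gfn: ballooned-out gfn to repopulate; new_gfn: gfn to balloon out."""
    # Step 1 (stealing the old page) is a no-op in this toy model,
    # since the ballooned-out old gfn has no backing frame here.
    heap_mfn = domheap_free.pop()          # 2. allocate a heap page from domheap
    domheap_free.append(physmap[new_gfn])  # 3. release the new page back to Xen
    physmap[old_gfn] = heap_mfn            # 4. old page -> heap page,
    physmap[new_gfn] = INVALID_MFN         #    new page -> INVALID_MFN

physmap = {0: INVALID_MFN, 1: 0x2000}      # gfn 0 ballooned out, gfn 1 backed
free_list = [0x3000]
backed_before = sum(m != INVALID_MFN for m in physmap.values())
xenmem_exchange(physmap, free_list, old_gfn=0, new_gfn=1)
backed_after = sum(m != INVALID_MFN for m in physmap.values())
print(backed_before == backed_after)       # True: footprint unchanged
```

Because the inflate and the deflate happen in one operation, the
number of backed frames never changes at any point, which is why a
domain at an enforced memory target can still compact.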

The end result is that the HAP entry for the "old page" now points to
a valid MFN instead of INVALID\_MFN; the HAP entry for the "new page"
now points to INVALID\_MFN.

So for the old page we're in the same position as before: the HAP
table is fragmented, but it's no more fragmented than before.

For the new page, the risk is that if the targeted new guest page is
part of a huge page, we need to split a HAP entry, hence fragmenting
the HAP table. This is a valid concern. However, in practice the guest
address space is already fragmented by ballooning. It's unlikely we
need to break up any more huge pages, because there aren't many left.
So we're in a position no worse than before.

Another downside is that when Xen is exchanging a page, it may need to
break up a huge page to get a 4K page, fragmenting the Xen domheap.
However, we're no worse off than before, as ballooning already
fragments the domheap.

## Beyond Linux balloon compaction infrastructure

Currently there's no mechanism in Xen to coalesce HAP table entries.
To coalesce HAP entries we would need to make sure all the discrete
entries belong to one huge page, are in the correct order and in the
correct state.

By introducing the necessary infrastructure inside the hypervisor
(page migration etc.), we might eventually be able to coalesce HAP
entries, hence reducing the number of H terms in the aforementioned
formula. This, combined with the work on the guest side, can help the
guest achieve the best possible performance.
