
[Xen-devel] [RFC][PATCH] 0/9 Populate-on-demand memory

  • To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "George Dunlap" <dunlapg@xxxxxxxxx>
  • Date: Tue, 23 Dec 2008 12:55:10 +0000
  • Delivery-date: Tue, 23 Dec 2008 04:55:36 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

This set of patches introduces a set of mechanisms and interfaces to
implement populate-on-demand memory.  The purpose of populate-on-demand
memory is to allow non-paravirtualized guests (such as Windows or HVM
Linux) to boot in a ballooned state.


When a non-PV domain boots, it typically reads the e820 map to
determine how much memory it has, and assumes that amount of memory
thereafter.  Memory usage can be reduced using a balloon driver, but
it cannot be increased past this initial value.  Currently, this means
that a non-PV domain must be booted with the maximum amount of memory
you ever want that VM to be able to use.

Populate-on-demand allows us to "boot ballooned", in the following manner:
* Mark the entire range of memory (memory_static_max aka maxmem) with
a new p2m type, populate_on_demand, reporting memory_static_max in the
e820 map.  No memory is allocated at this stage.
* Allocate the "memory_dynamic_max" (aka "target") amount of memory
for a "PoD cache".  This memory is kept on a separate list in the
domain struct.
* Boot the guest.
* Populate the p2m table on-demand as it's accessed with pages from
the PoD cache.
* When the balloon driver loads, it inflates the balloon size to
(maxmem - target), giving the memory back to Xen.  Once this is
accomplished, the "populate-on-demand" portion of boot is effectively
complete.

One complication is that many operating systems have start-of-day page
scrubbers, which touch all of memory to zero it.  This scrubber may
run before the balloon driver can return memory to Xen.  These zeroed
pages, however, don't contain any information; we can safely replace
them with PoD entries again.  So when we run out of PoD cache, we do
an "emergency sweep" to look for zero pages we can reclaim for the
populate-on-demand cache.  When we find a page range which is entirely
zero, we mark the gfn range PoD again, and put the memory back into
the PoD cache.

NB that this code is designed to work only in conjunction with a
balloon driver.  If the balloon driver is not loaded, eventually all
pages will be dirtied (non-zero), the emergency sweep will fail, and
there will be no memory to back outstanding PoD pages.  When this
happens, the domain will crash.

The code works for both shadow mode and HAP mode; it has been tested
with NPT/RVI and shadow, but not yet with EPT.  It also attempts to
avoid splintering superpages, to allow HAP to function more
effectively.

To use:
* ensure that you have a functioning balloon driver in the guest
(e.g., xen_balloon.ko for Linux HVM guests).
* Set maxmem/memory_static_max to one value, and
memory/memory_dynamic_max to another when creating the domain; e.g:
 # xm create debian-hvm maxmem=512 memory=256

The patches are as follows:
01 - Add a p2m_query_type to core gfn_to_mfn*() functions.

02 - Change some gfn_to_mfn() calls to gfn_to_mfn_query(), which will
not populate PoD entries.  Specifically, since gfn_to_mfn() may grab
the p2m lock, it must not be called while the shadow lock is held.

03 - Populate-on-demand core.  Introduce new p2m type, PoD cache
structures, and core functionality.  Add PoD checking to audit_p2m().
Add PoD information to the 'q' debug key.

04 - Implement p2m_decrease_reservation.  As the balloon driver
returns gfns to Xen, it handles PoD entries properly; it also "steals"
memory being returned for the PoD cache instead of freeing it, if the
cache has fewer pages than there are outstanding PoD entries.

05 - emergency sweep: Implement emergency sweep for zero memory if the
cache is low.  If it finds pages (or page ranges) entirely zero, it
will replace the entry with a PoD entry again, reclaiming the memory
for the PoD cache.

06 - Deal with splintering both PoD pages (to back singleton PoD
entries) and PoD ranges

07 - Xen interface for populate-on-demand functionality: PoD flag for
populate_physmap, {get,set}_pod_target for interacting with the PoD
cache.  set_pod_target() should be called for any domain that may have
PoD entries.  It will increase the size of the cache if necessary, but
will never decrease the size of the cache.  (This will be done as the
balloon driver balloons down.)

08 - libxc interface.  Add new libxc functions:
+ xc_hvm_build_target_mem(), which accepts memsize and target.  If
these are equal, PoD functionality is not invoked.  Otherwise, memsize
is marked PoD, and the target MiB is allocated to the PoD cache.
+ xc_[sg]et_pod_target(): get / set PoD target.  set_pod_target()
should be called whenever you change the guest target mem on a domain
which may have outstanding PoD entries.  This may increase the size of
the PoD cache up to the number of outstanding PoD entries, but will
not reduce the size of the cache.  (The cache may be reduced as the
balloon driver returns gfn space to Xen.)

09 - xend integration.
+ Always calls xc_hvm_build_target_mem() with memsize=maxmem and
target=memory.  If these are the same, the internal function will not
invoke PoD functionality.
+ Calls xc_set_pod_target() whenever a domain's target is changed.
Also calls balloon.free(), causing dom0 to balloon down itself if
there's not otherwise enough memory.

Things still to do:
* When reduce_reservation() is called with a superpage, keep the
superpage intact.
* Create a hypercall continuation for set_pod_target.

Xen-devel mailing list
