Re: [Xen-devel] initial ballooning amount on HVM+PoD
>>> On 17.01.14 at 16:54, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 01/17/2014 09:33 AM, Jan Beulich wrote:
>> While looking into Jürgen's issue with PoD setup causing soft lockups
>> in Dom0 I realized that what I did in linux-2.6.18-xen.hg's c/s
>> 989:a7781c0a3b9a ("xen/balloon: fix balloon driver accounting for
>> HVM-with-PoD case") just doesn't work - the BUG_ON() added there
>> triggers as soon as there's a reasonable amount of excess memory.
>> And that is despite me knowing that I spent significant amounts of
>> time testing that change - I must have tested something else than
>> what finally got checked in, or must have screwed up in some other
>> way. Extremely embarrassing...
>>
>> In the course of finding a proper solution I soon stumbled across
>> upstream's c275a57f5e ("xen/balloon: Set balloon's initial state to
>> number of existing RAM pages"), and hence went ahead and compared
>> three different calculations for the initial bs.current_pages:
>>
>> (a) upstream's (open coding get_num_physpages(), as I did this on
>>     an older kernel)
>> (b) plain old num_physpages (equaling the maximum RAM PFN)
>> (c) XENMEM_get_pod_target output (with the hypervisor altered to
>>     not refuse this for a domain doing it on itself)
>>
>> The fourth (original) method, using totalram_pages, was already
>> known to result in the driver not ballooning down enough, and hence
>> setting up the domain for an eventual crash when the PoD cache runs
>> empty.
>>
>> Interestingly, (a) too results in the driver not ballooning down
>> enough - there's a gap of exactly as many pages as are marked
>> reserved below the 1Mb boundary. Therefore the aforementioned
>> upstream commit is presumably broken.
>>
>> Short of a reliable (and ideally architecture-independent) way of
>> knowing the necessary adjustment value, the next best solution (not
>> ballooning down too little, but also not ballooning down much more
>> than necessary) turns out to be using the minimum of (b) and (c):
>> when the domain only has memory below 4Gb, (b) is more precise,
>> whereas in the other cases (c) gets closest.
>
> I am not sure I understand why (b) would be the right answer for
> less-than-4G guests. The reason for the c275a57f5e patch was that
> max_pfn includes MMIO space (which is not RAM) and thus the driver
> will unnecessarily balloon down that much memory.

max_pfn/num_physpages isn't that far off for a guest with less than
4Gb; the number calculated from the PoD data is a little worse.

>> The question now is: considering that (a) is broken (and hard to
>> fix), and that (b) presumably leads to too much ballooning down in
>> a large share of practical cases, shouldn't we open up
>> XENMEM_get_pod_target for domains to query on themselves?
>> Alternatively, can anyone see another way to calculate a reasonably
>> precise value?
>
> I think a hypervisor query is a good thing, although I don't know
> whether exposing PoD-specific data (count and entry_count) to the
> guest is necessary. It's probably OK (or we can set these fields to
> zero for non-privileged domains).

That would be pointless then - if no useful data is provided through
the call to non-privileged domains, we may as well keep it erroring
for them.

Jan
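For illustration only, below is a minimal C sketch of the min((b), (c))
calculation discussed above, not code from the thread. It assumes the
hypervisor were relaxed to accept XENMEM_get_pod_target for DOMID_SELF
(which, as the thread notes, it currently refuses), guesses tot_pages as
the relevant output field (the message does not spell this out), mirrors
the structure layout from Xen's public memory.h since the kernel's
trimmed copy of that header may not carry the PoD definitions, and uses
the pre-3.12 num_physpages global the thread refers to.
balloon_initial_pages() is a hypothetical helper name.

    /*
     * Sketch: initial bs.current_pages as the minimum of
     * (b) num_physpages and (c) the XENMEM_get_pod_target output,
     * assuming the hypervisor accepts the call for DOMID_SELF.
     */
    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/mm.h>
    #include <linux/types.h>
    #include <xen/interface/xen.h>
    #include <asm/xen/hypercall.h>

    /* Mirrors struct xen_pod_target in Xen's public memory.h. */
    #define XENMEM_get_pod_target 17

    struct xen_pod_target {
            uint64_t target_pages;          /* IN */
            uint64_t tot_pages;             /* OUT */
            uint64_t pod_cache_pages;       /* OUT */
            uint64_t pod_entries;           /* OUT */
            domid_t domid;                  /* IN */
    };

    static unsigned long __init balloon_initial_pages(void)
    {
            struct xen_pod_target pod = {
                    .domid = DOMID_SELF,
            };
            /* (b): plain old num_physpages, i.e. the maximum RAM PFN. */
            unsigned long pages = num_physpages;

            /* (c): which output field to use is a guess here. */
            if (HYPERVISOR_memory_op(XENMEM_get_pod_target, &pod) == 0)
                    pages = min(pages, (unsigned long)pod.tot_pages);

            return pages;
    }

Whether the minimum should be taken against tot_pages alone or some
combination of the PoD fields is exactly the kind of detail that opening
up the hypercall to self-queries would need to settle.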
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel