Re: [Xen-devel] initial ballooning amount on HVM+PoD
On Fri, 2014-01-17 at 16:03 +0000, Jan Beulich wrote:
> >>> On 17.01.14 at 16:54, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
> > On 01/17/2014 09:33 AM, Jan Beulich wrote:
> >> While looking into Jürgen's issue with PoD setup causing soft lockups
> >> in Dom0 I realized that what I did in linux-2.6.18-xen.hg's c/s
> >> 989:a7781c0a3b9a ("xen/balloon: fix balloon driver accounting for
> >> HVM-with-PoD case") just doesn't work - the BUG_ON() added there
> >> triggers as soon as there's a reasonable amount of excess memory.
> >> And that is despite me knowing that I spent a significant amount of
> >> time testing that change - I must have tested something other than
> >> what finally got checked in, or must have screwed up in some other
> >> way. Extremely embarrassing...
> >>
> >> In the course of finding a proper solution I soon stumbled across
> >> upstream's c275a57f5e ("xen/balloon: Set balloon's initial state to
> >> number of existing RAM pages"), and hence went ahead and compared
> >> three different calculations for the initial bs.current_pages:
> >>
> >> (a) upstream's (open-coding get_num_physpages(), as I did this on
> >>     an older kernel)
> >> (b) plain old num_physpages (equaling the maximum RAM PFN)
> >> (c) XENMEM_get_pod_target output (with the hypervisor altered to
> >>     not refuse this for a domain doing it on itself)
> >>
> >> The fourth (original) method, using totalram_pages, was already
> >> known to result in the driver not ballooning down enough, and hence
> >> setting up the domain for an eventual crash when the PoD cache runs
> >> empty.
> >>
> >> Interestingly, (a) too results in the driver not ballooning down
> >> enough - there's a gap of exactly as many pages as are marked
> >> reserved below the 1Mb boundary. Therefore the aforementioned
> >> upstream commit is presumably broken.
> >>
> >> Short of a reliable (and ideally architecture-independent) way of
> >> knowing the necessary adjustment value, the next best solution (not
> >> ballooning down too little, but also not ballooning down much more
> >> than necessary) turns out to be using the minimum of (b) and (c):
> >> when the domain only has memory below 4Gb, (b) is more precise,
> >> whereas in the other cases (c) gets closest.
> >
> > I am not sure I understand why (b) would be the right answer for
> > less-than-4G guests. The reason for the c275a57f5e patch was that
> > max_pfn includes MMIO space (which is not RAM) and thus the driver
> > will unnecessarily balloon down that much memory.
>
> max_pfn/num_physpages isn't that far off for guests with less than
> 4Gb; the number calculated from the PoD data is a little worse.

On ARM, RAM may not start at 0, so using max_pfn can be very
misleading and in practice causes arm to balloon down to 0 as fast as
it can.
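For illustration only, a rough sketch of the "minimum of (b) and (c)"
calculation proposed above could look like the code below. It assumes an
older kernel that still exports num_physpages, copies the
XENMEM_get_pod_target number and struct xen_pod_target layout from Xen's
public memory.h (the kernel's own Xen headers do not carry them), and
treats tot_pages as the PoD-derived value - which field (or combination
of fields) is really the right one is not settled in this thread, and
the hypercall only succeeds for the domain itself with the hypervisor
change mentioned above. The hypothetical helper name is made up for the
sketch.

    /*
     * Sketch only: pick the smaller of (b) num_physpages and (c) the
     * value reported by XENMEM_get_pod_target as the initial balloon
     * accounting figure.
     */
    #include <linux/init.h>
    #include <linux/types.h>
    #include <linux/mm.h>                 /* num_physpages (older kernels) */
    #include <xen/interface/xen.h>        /* domid_t, DOMID_SELF */
    #include <asm/xen/hypercall.h>        /* HYPERVISOR_memory_op() */

    /* Copied from Xen's public memory.h; not present in the kernel tree. */
    #define XENMEM_get_pod_target 17

    struct xen_pod_target {
        uint64_t target_pages;
        uint64_t tot_pages;
        uint64_t pod_cache_pages;
        uint64_t pod_entries;
        domid_t domid;
    };

    static unsigned long __init balloon_initial_pages(void)
    {
        struct xen_pod_target pod = { .domid = DOMID_SELF };
        unsigned long pages = num_physpages;       /* (b): max RAM PFN */

        /*
         * (c): ask the hypervisor about the domain's PoD state.  Using
         * tot_pages here is an assumption; the call fails (and (b) is
         * kept) unless the hypervisor permits it for the calling domain.
         */
        if (HYPERVISOR_memory_op(XENMEM_get_pod_target, &pod) == 0 &&
            pod.tot_pages && pod.tot_pages < pages)
            pages = pod.tot_pages;

        return pages;   /* would become the initial bs.current_pages */
    }

Whether tot_pages alone, or something like tot_pages - pod_cache_pages +
pod_entries, best matches what the guest actually has populated is
exactly the kind of detail the thread leaves open.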