Re: [Xen-devel] initial ballooning amount on HVM+PoD
>>> On 17.01.14 at 17:13, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
> On 01/17/2014 11:03 AM, Jan Beulich wrote:
>>>>> On 17.01.14 at 16:54, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx> wrote:
>>> On 01/17/2014 09:33 AM, Jan Beulich wrote:
>>>> While looking into Jürgen's issue with PoD setup causing soft lockups
>>>> in Dom0 I realized that what I did in linux-2.6.18-xen.hg's c/s
>>>> 989:a7781c0a3b9a ("xen/balloon: fix balloon driver accounting for
>>>> HVM-with-PoD case") just doesn't work - the BUG_ON() added there
>>>> triggers as soon as there's a reasonable amount of excess memory.
>>>> And that is despite me knowing that I spent a significant amount of
>>>> time testing that change - I must have tested something else than
>>>> what finally got checked in, or must have screwed up in some other
>>>> way. Extremely embarrassing...
>>>>
>>>> In the course of finding a proper solution I soon stumbled across
>>>> upstream's c275a57f5e ("xen/balloon: Set balloon's initial state to
>>>> number of existing RAM pages"), and hence went ahead and
>>>> compared three different calculations for the initial bs.current_pages:
>>>>
>>>> (a) upstream's (open coding get_num_physpages(), as I did this on
>>>> an older kernel)
>>>> (b) plain old num_physpages (equaling the maximum RAM PFN)
>>>> (c) XENMEM_get_pod_target output (with the hypervisor altered
>>>> to not refuse this for a domain doing it on itself)
>>>>
>>>> The fourth (original) method, using totalram_pages, was already
>>>> known to result in the driver not ballooning down enough, and
>>>> hence setting up the domain for an eventual crash when the PoD
>>>> cache runs empty.
>>>>
>>>> Interestingly, (a) too results in the driver not ballooning down
>>>> enough - there's a gap of exactly as many pages as are marked
>>>> reserved below the 1Mb boundary. Therefore the aforementioned
>>>> upstream commit is presumably broken.
>>>>
>>>> Short of a reliable (and ideally architecture-independent) way of
>>>> knowing the necessary adjustment value, the next best solution
>>>> (not ballooning down too little, but also not ballooning down much
>>>> more than necessary) turns out to be using the minimum of (b)
>>>> and (c): when the domain only has memory below 4Gb, (b) is
>>>> more precise, whereas in the other cases (c) gets closest.
>>> I am not sure I understand why (b) would be the right answer for
>>> less-than-4G guests. The reason for the c275a57f5e patch was that
>>> max_pfn includes MMIO space (which is not RAM) and thus the driver
>>> will unnecessarily balloon down that much memory.
>> max_pfn/num_physpages isn't that far off for a guest with less than
>> 4Gb; the number calculated from the PoD data is a little worse.
>
> For a 4G guest it's 65K pages that are ballooned down so it's not
> insignificant.

I didn't say (in the original mail) a 4Gb guest - I said a guest with
memory only below 4Gb. So yes, for a 4Gb guest this is unacceptably
high, ...

> And if you are increasing MMIO size (something that we had to do here)
> it gets progressively worse.

... and growing with MMIO size, hence the PoD data yields better
results in that case.

>>>> The question now is: considering that (a) is broken (and hard to fix)
>>>> and (b) in presumably a large part of practical cases leads to
>>>> too much ballooning down, shouldn't we open up
>>>> XENMEM_get_pod_target for domains to query on themselves?
>>>> Alternatively, can anyone see another way to calculate a
>>>> reasonably precise value?
>>> I think a hypervisor query is a good thing, although I don't know whether
>>> exposing PoD-specific data (count and entry_count) to the guest is
>>> necessary. It's probably OK (or we can set these fields to zero for
>>> non-privileged domains).
>> That's pointless then - if no useful data is provided through the
>> call to non-privileged domains, we might as well keep it erroring for
>> them.
>
> I thought that you are after d->tot_pages, no? That can be obtained
> through another XENMEM_ operation.

No, what is needed is the difference between PoD entries and PoD cache
(which then needs to be added to tot_pages).

Jan
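To make the arithmetic discussed above concrete, here is a minimal sketch of
the calculation Jan describes: taking the minimum of (b) num_physpages and
(c) the value derived from XENMEM_get_pod_target, where the PoD contribution
is tot_pages plus the difference between PoD entries and the PoD cache. It
assumes a hypervisor altered to allow the query on DOMID_SELF (which, per the
thread, mainline Xen refuses); the struct layout mirrors Xen's public
xen/include/public/memory.h, while the balloon_initial_pages() helper, its
fallback path, and the local copy of the struct are illustrative assumptions
rather than actual balloon driver code.

/*
 * Sketch only: pick the initial bs.current_pages as the minimum of
 * (b) num_physpages (older kernels, as in the thread) and
 * (c) tot_pages + (pod_entries - pod_cache_pages) as reported by
 *     XENMEM_get_pod_target issued on DOMID_SELF.
 */
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/mm.h>
#include <xen/interface/xen.h>
#include <asm/xen/hypercall.h>

#define XENMEM_get_pod_target 17  /* value from xen/include/public/memory.h */

/* Mirrors struct xen_pod_target in Xen's public memory.h; the kernel's
 * copy of the interface headers may not carry this definition. */
struct xen_pod_target {
    uint64_t target_pages;     /* IN  */
    uint64_t tot_pages;        /* OUT */
    uint64_t pod_cache_pages;  /* OUT */
    uint64_t pod_entries;      /* OUT */
    domid_t domid;             /* IN  */
};

static unsigned long __init balloon_initial_pages(void)
{
    struct xen_pod_target pod = { .domid = DOMID_SELF };
    unsigned long from_pod;

    /* Requires a hypervisor that permits this query for DOMID_SELF. */
    if (HYPERVISOR_memory_op(XENMEM_get_pod_target, &pod) != 0)
        return num_physpages;             /* fall back to (b) alone */

    /* PoD entries not yet backed by the PoD cache still have to be
     * ballooned out, on top of the pages the domain currently owns. */
    from_pod = pod.tot_pages + (pod.pod_entries - pod.pod_cache_pages);

    return min(num_physpages, from_pod);
}

With this selection, a domain whose RAM sits entirely below 4Gb ends up using
num_physpages, while domains with larger MMIO holes (and hence a larger
max_pfn) end up using the PoD-derived value, matching the trade-off described
in the thread.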