Re: [Xen-devel] Future support of 5-level paging in Xen
On 12/08/2016 07:20 PM, Andrew Cooper wrote:
> On 08/12/2016 23:40, Boris Ostrovsky wrote:
>> Of course even the largest virtual machine today (2TB on Amazon AFAIK)
>> is not close to reaching the current memory limit, but it's just a
>> matter of time.
>
> /me thinks Oracle will have something to say about this. I'm sure there
> was talk about VMs larger than this at previous hackathons. XenServer
> functions (ish, so long as you don't migrate) with 6TB VMs, although
> starting and shutting them down feels like treacle.

I've been working (on and off) with SGI to get one of their 32TB boxes to
boot, and I don't think that works. We've fixed a couple of bugs, but I
don't think Xen can boot with that much memory. We successfully booted
with just under 8TB but couldn't do it with the full system. The machine
has been taken from us for now, so this work is on hold.

This is on OVM, which is 4.4-based; we haven't tried (IIRC) the latest bits.

> Because 64bit PV guests get 97% of the virtual address space, Xen hits
> highmem/lowmem problems at the 5TB boundary, which is where we run out
> of virtual address space for the directmap. Xen supports up to 16TB of
> RAM (32 bits in struct page_info, for a total of 44 bits of mfns),
> although last time I checked Xen was still unstable if there was any
> RAM above the 5TB boundary. Jan did subsequently find and fix an
> off-by-one error, and I haven't had occasion to re-test since.
>
> If you enable CONFIG_BIGMEM (newer than 4.4 I think, but I don't

And apparently we don't have that in the OVM version I am looking at. But
I'll try the upstream bits when we get a chance to get on this box.

> actually recall), Xen's virtual layout changes. The directmap shrinks
> to just 3.5TB, to make space for a frametable containing larger struct
> page_info's with 64bit indices. This has a total supported limit of
> 123TB of RAM, due to the virtual range allocated to the frametable.
>
> When I observed this going wrong, it went wrong because
> alloc_xenheap_page() handed back virtual addresses which creep into the
> 64bit PV kernel's ABI range. These virtual addresses are safe for Xen
> to use in idle and hvm contexts, but not in PV context.
>
>> (BTW, speaking of slow starting and shutting down very large guests ---
>> have you or anyone else had a chance to look at this? My investigation
>> initially pointed to scrubbing, and then to an insane number of
>> hypercall preemptions in relinquish_memory().)
>
> This is another item I meant to re-engage on. (It's on my todo list,
> along with CPUID and nested virt, but looks like it is depending on my
> wishlist item of several extra hours in the day to get some of the work
> done in.)
>
> Yes. We should do something towards fixing that. Current performance
> measurements put a 1.5TB domain at ~14 minutes for the domain_kill
> hypercall to complete.
>
> I seem to recall some vague plans towards having per-node dirty-page
> lists, scrubbing in idle context, and on-demand scrubbing at alloc-time
> if the clean list is empty.

I have this (almost) working, but then I found that the hypercall
preemption was eating even more time than scrubbing, and got distracted by
that. And then by other things (I have the attention span of a squirrel).

-boris
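For reference, a minimal sketch of the arithmetic behind the limits quoted
above, assuming 4KiB pages. The 16TB figure falls straight out of a 32-bit
page index; the frametable window and struct page_info size used for the
CONFIG_BIGMEM case are illustrative assumptions, not values taken from the
Xen source, so the result only approximates the 123TB mentioned in the
thread.

```c
/* Back-of-the-envelope arithmetic for the RAM limits discussed above.
 * Assumes 4KiB pages; the BIGMEM values marked ASSUMED are illustrative
 * only -- the real layout lives in the Xen headers.
 */
#include <stdio.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)    /* 4KiB pages */

int main(void)
{
    /* Default build: struct page_info uses 32-bit page indices, so at
     * most 2^32 frames can be tracked: 2^32 * 4KiB = 16TB, i.e. 44-bit
     * machine addresses -- the "16TB / 44 bits of mfns" figure above. */
    uint64_t frames_32bit = 1ULL << 32;
    printf("32-bit index limit: %llu TB\n",
           (unsigned long long)((frames_32bit * PAGE_SIZE) >> 40));

    /* CONFIG_BIGMEM: the limit becomes
     *   (frametable virtual range / sizeof(struct page_info)) * PAGE_SIZE.
     * Both inputs below are ASSUMED for illustration. */
    uint64_t frametable_va_bytes = 2ULL << 40;   /* ASSUMED ~2TB window    */
    uint64_t page_info_size      = 64;           /* ASSUMED 64-byte struct */
    uint64_t bigmem_limit = frametable_va_bytes / page_info_size * PAGE_SIZE;
    printf("BIGMEM limit (assumed sizes): %llu TB\n",
           (unsigned long long)(bigmem_limit >> 40));
    return 0;
}
```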
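The per-node dirty-page-list idea Andrew recalls could look roughly like
the toy sketch below: separate clean and dirty free lists per node, freed
pages parked on the dirty list unscrubbed, the idle loop draining it in
the background, and the allocator scrubbing on demand only when the clean
list is empty. This is not Xen's page allocator; every name and structure
here is made up for the example, and locking is omitted entirely.

```c
/* Toy illustration of the scrubbing scheme discussed above.
 * Not taken from the Xen source; for illustration only.
 */
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

struct page {
    struct page *next;
    void *data;            /* stand-in for the page contents */
};

struct node_freelists {
    struct page *clean;    /* scrubbed pages, ready to hand out */
    struct page *dirty;    /* freed but not yet scrubbed        */
};

static void scrub(struct page *pg)
{
    memset(pg->data, 0, PAGE_SIZE);
}

/* Freeing a page does no scrubbing at all: it just goes on the
 * node's dirty list, keeping domain teardown cheap. */
static void free_page(struct node_freelists *n, struct page *pg)
{
    pg->next = n->dirty;
    n->dirty = pg;
}

/* Called from the idle loop: scrub one dirty page and move it to
 * the clean list, so allocations normally find pre-scrubbed pages. */
static void idle_scrub_one(struct node_freelists *n)
{
    struct page *pg = n->dirty;
    if (!pg)
        return;
    n->dirty = pg->next;
    scrub(pg);
    pg->next = n->clean;
    n->clean = pg;
}

/* Allocation prefers the clean list and only scrubs on demand when
 * it is empty -- the slow path the thread worries about. */
static struct page *alloc_page(struct node_freelists *n)
{
    struct page *pg = n->clean;
    if (pg) {
        n->clean = pg->next;
        return pg;
    }
    pg = n->dirty;
    if (!pg)
        return NULL;       /* node is out of free memory */
    n->dirty = pg->next;
    scrub(pg);
    return pg;
}
```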