Xen project Mailing List

[Xen-devel] Re: Q about System-wide Memory Management Strategies

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>

From: Joanna Rutkowska <joanna@xxxxxxxxxxxxxxxxxxxxxx>

Date: Wed, 04 Aug 2010 00:33:17 +0200

Cc: xen-devel@xxxxxxxxxxxxxxxxxxx, qubes-devel@xxxxxxxxxxxxxxxx

Delivery-date: Tue, 03 Aug 2010 15:34:18 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On 08/03/10 01:57, Dan Magenheimer wrote:
> Hi Joanna --
>
> The slides you refer to are over two years old, and there's
> been a lot of progress in this area since then. I suggest
> you google for "Transcendent Memory" and especially
> my presentation at the most recent Xen Summit North America
> and/or http://oss.oracle.com/projects/tmem
>

Thanks Dan.

I've been aware of tmem, but I've been skeptical about it for two
reasons: it's complex, and it seems rather unportable to other OSes,
specifically Windows, which is a concern for us, as we plan to support
Windows AppVMs in Qubes in the future. (Hmm, is it really unportable?
Perhaps one could create a pseudo-filesystem driver that would behave
like precache, and a pseudo-disk driver that would behave like preswap?)

From reading the papers on tmem (the hogs were really cute :), I
understand now that the single most important advantage of using tmem
vs. just ballooning is: no memory inertia for needy VMs, correct?

I'm tempted to think that this might not be such a big deal for the
Qubes-specific types of workload -- after all, if some app starts
slowing down, the user will temporarily stop "operating" it and let the
system recover within a few seconds, once the balloon returns some more
memory. Or am I wrong here, and the recovery is not so easy in practice?

> Specifically, I now have "selfballooning" built into
> the guest kernel...

In your latest presentation you mention selfballooning implemented in
the kernel, rather than via a userland daemon -- is there any
significant benefit to this? I've been thinking of trying selfballooning
using a 2.6.34-xenlinux kernel with a usermode balloond...

How should the VMs be initially provisioned with selfballooning, i.e.
how to set memory and maxmem? I'm tempted to set maxmem to the amount of
all physical memory minus the memory reserved for Dom0 and other service
VMs (which would get a fixed, small amount). The rationale behind this
is that we don't know what type of tasks the user will end up doing in
any given VM, and she might very well end up with something really
memory-hungry (sure, we will not let any other VMs run at the same time
in that case, but we should still be able to handle this, I think).

> I don't see direct ballooning as feasible (certainly without other
> guest changes such as cleancache and frontswap).
>

Why is that? Intuitively it sounds like the most straightforward
solution -- only Dom0 can see the system-wide picture of all the VMs'
needs (and priorities).

What happens if too many guests request too much memory, i.e. each
within its maxmem limit, but such that the overall total exceeds the
total available in the system? I guess then whoever was first and lucky
would get the memory, but the last ones would get nothing, right? While
if we had centrally managed allocation, we would be able to e.g. scale
down the target memory sizes equally, or tell the user that some VMs
must be closed for smooth operation of the others (or close them
automatically).

> Anyway, I have limited availability in the next couple of
> weeks but would love to talk (or email) more about
> this topic after that (but would welcome clarification
> questions in the meantime).
>

No problem. Hopefully some of the above questions would fall into the
"clarification" category :) And maybe others will answer the others :)

Thanks,
joanna.

> Dan
>
>> -----Original Message-----
>> From: Joanna Rutkowska [mailto:joanna@xxxxxxxxxxxxxxxxxxxxxx]
>> Sent: Monday, August 02, 2010 3:39 PM
>> To: xen-devel@xxxxxxxxxxxxxxxxxxx; Dan Magenheimer
>> Cc: qubes-devel@xxxxxxxxxxxxxxxx
>> Subject: Q about System-wide Memory Management Strategies
>>
>> Dan, Xen.org'ers,
>>
>> I have a few questions regarding strategies for optimal memory
>> assignment among VMs (PV DomU and Dom0, all Linux-based).
>>
>> We've been thinking about implementing a "Direct Ballooning" strategy
>> (as described on slide #20 in Dan's slides [1]), i.e. to write a daemon
>> that would be running in Dom0 and, based on the statistics provided by
>> balloond daemons running in DomUs, would adjust the memory assigned to
>> all VMs in the system (via xm mem-set).
>>
>> Rather than trying to maximize the number of VMs we could run at the
>> same time, in Qubes OS we are more interested in optimizing user
>> experience for running a "reasonable number" of VMs (i.e.
>> minimizing/eliminating swapping). In other words, given the number of
>> VMs that the user feels the need to run at the same time (in practice
>> usually between 3 and 6), and given the amount of RAM in the system
>> (4-6 GB in practice today), how to optimally distribute it among the
>> VMs? In our model we assume the disk backend(s) are in Dom0.
>>
>> Some specific questions:
>> 1) What is the best estimator of the "ideal" amount of RAM each VM
>> would like to have? Dan mentions [1] the Committed_AS value from
>> /proc/meminfo, but what about the fs cache? I would expect that we
>> should (ideally) allocate Committed_AS + some_cache amount of RAM, no?
>>
>> 2) What's the best estimator for the "minimal reasonable" amount of RAM
>> for a VM (below which the swapping would kill the performance for
>> good)? The rationale behind this is that if we couldn't allocate the
>> "ideal" amount of RAM (point 1 above), then we would be scaling the
>> available RAM down until this "reasonable minimum" value. Below this,
>> we would display a message to the user that they should close some VMs
>> (or we would close "inactive" ones automatically), and we would also
>> refuse to start any new AppVMs.
>>
>> 3) Assuming we have enough RAM to satisfy all the VMs' "ideal"
>> requests, what should we do with the excess RAM -- options are:
>> a) distribute it among all the VMs (more per-VM RAM means larger FS
>> caches, means faster I/O), or
>> b) assign it to Dom0, where the disk backend is running (larger FS
>> cache means faster disk backends, means faster I/O in each VM?)
>>
>> Thanks,
>> joanna.
>>
>> [1]
>> http://www.xen.org/files/xensummitboston08/MemoryOvercommit-XenSummit2008.pdf
>>
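A minimal sketch of the centrally managed policy discussed in this thread, under stated assumptions. It estimates each VM's "ideal" size as Committed_AS plus a cache allowance (question 1), scales all targets down towards a per-VM minimum when RAM is short (question 2), and spreads any excess evenly across the VMs (option 3a). The function names, the fixed cache allowance, and the proportional scale-down rule are illustrative choices, not something the thread settles on; applying the resulting targets (for example with xm mem-set) is left out.

```python
import re


def parse_meminfo(text):
    """Parse 'Name:   123 kB' lines of /proc/meminfo into {name: kB}."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"(\w+):\s+(\d+)", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def ideal_kb(meminfo, cache_allowance_kb):
    """Assumed estimator: Committed_AS plus a fixed cache allowance.

    How large "some_cache" should be is an open question in the thread;
    the fixed allowance here is only a placeholder.
    """
    return meminfo["Committed_AS"] + cache_allowance_kb


def allocate(total_kb, ideal, minimum):
    """Return {vm: target kB}, or None if even the minima do not fit.

    ideal and minimum map VM name -> kB, with ideal[vm] >= minimum[vm].
    """
    if not ideal:
        return {}
    need_min = sum(minimum.values())
    if need_min > total_kb:
        # Hypothetical caller reaction: ask the user to close some VMs.
        return None
    need_ideal = sum(ideal.values())
    if need_ideal <= total_kb:
        # Option (a): share the excess evenly among the VMs.
        extra = (total_kb - need_ideal) // len(ideal)
        return {vm: ideal[vm] + extra for vm in ideal}
    # Not enough for every ideal: give each VM its minimum plus the
    # same fraction of the gap between its minimum and its ideal.
    slack = total_kb - need_min
    want = need_ideal - need_min
    return {
        vm: minimum[vm] + (ideal[vm] - minimum[vm]) * slack // want
        for vm in ideal
    }
```

For example, with 4000 kB available, ideals of 3000 and 2000 kB and minima of 1000 kB each, both VMs receive their minimum plus two thirds of their remaining request. A return value of None corresponds to the "some VMs must be closed" case.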


