
Re: [Xen-devel] Please ack XENMEM_claim_pages hypercall?



Dan Magenheimer writes ("RE: Please ack XENMEM_claim_pages hypercall?"):
> From a single-system-xl-toolstack-centric perspective ("paradigm"),
> I can see your point.

I don't think this is the case.  What you are doing is putting this
node-specific claim functionality in the hypervisor.  I still think it
should be done outside the hypervisor.  This does not mean that there
has to be a single omniscient piece of software for an entire
cluster.

It just means that there has to be a single omniscient piece of
software _for a particular host_.  In your proposal that piece of
software is the hypervisor - specifically, the part of the hypervisor
that does the bookkeeping of these claims.  However I am still
unconvinced that this can't be implemented outside the hypervisor.


In general it is a key design principle of a system like Xen that the
hypervisor should provide only those facilities which are strictly
necessary.  Any functionality which can be reasonably provided outside
the hypervisor should be excluded from it.  It is this principle which
you are running up against.


It may be (and it seems likely from what you've said in private email)
that some of your existing guests make some assumptions about the
semantics of the Xen ballooning memory interface, which might be
violated by such a design.  Specifically it appears that your design
has guests deliberately balloon down further than requested by the
toolstack, in a kind of attempt to negotiate with other users of
memory on the system by actually releasing and claiming Xen memory.

But I don't think it's reasonable to demand that the shared codebase
reflect such undocumented prior assumptions.  Particularly when this
design seems poor.  I find it poor because (a) using actual memory
allocation and release provides only a very impoverished signalling
mechanism, and (b) it imposes on every part of the system the possibility
of unexpected memory allocation failure.  (Your claim hypercall is an
attempt to mitigate (b) for some but not all cases.)

It seems to me that the correct approach is to design and implement a
new interface which allows a guest (whether that be its kernel or an
agent) to conduct a richer negotiation with out-of-hypervisor
toolstack software.  But that toolstack software (which might take any
particular shape - certainly we don't want to make any assumptions
about its policies and nature) needs to be sufficiently aware of all
the claims to arbitrate between them, in a way that ensures that
guests which obey the rules never see an "out of memory" from the
hypervisor.
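To make that concrete, here is a minimal sketch of the sort of
per-host arbitrator I have in mind.  It is purely illustrative - the
names, the choice of Python, and the assumption that guests reach it
over something like xenstore are all mine, not a design:

import threading

class HostMemoryArbitrator:
    """Hypothetical out-of-hypervisor, per-host bookkeeper of memory
    promises ("claims").  Guests which obey the rules only populate
    memory they have been promised, so they never see an unexpected
    out-of-memory from the hypervisor."""

    def __init__(self, host_free_kib):
        self.lock = threading.Lock()
        self.free_kib = host_free_kib    # not yet promised to anyone
        self.claims = {}                 # domid -> kib promised, not yet used

    def claim(self, domid, amount_kib):
        # A guest (or the toolstack acting for it) asks for a promise
        # of memory.  The promise is recorded here, not in the hypervisor.
        with self.lock:
            if amount_kib > self.free_kib:
                return False             # refused up front, no surprise later
            self.free_kib -= amount_kib
            self.claims[domid] = self.claims.get(domid, 0) + amount_kib
            return True

    def consume(self, domid, amount_kib):
        # The guest actually populates memory it was promised earlier.
        with self.lock:
            assert amount_kib <= self.claims.get(domid, 0), "guest broke the rules"
            self.claims[domid] = self.claims.get(domid, 0) - amount_kib

    def release(self, domid, amount_kib):
        # The guest gives back part of a promise it no longer needs.
        with self.lock:
            self.claims[domid] = self.claims.get(domid, 0) - amount_kib
            self.free_kib += amount_kib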

Of course if it really is desired to have each guest make its own
decisions and simply for them to somehow agree to divvy up the
available resources, then even so a new hypervisor mechanism is not
needed.  All that is needed is a way for those guests to synchronise
their accesses and updates to shared records of the available and
in-use memory.
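(Again only a sketch, with made-up names: the "shared records" need be
no more than a record updated under mutual exclusion - via xenstore
transactions, a lock on shared storage, or whatever the guests already
have in common.)

import threading

def try_take(store, lock, domid, amount_kib):
    # Guests divvying up memory among themselves by updating a shared
    # record under a lock; no new hypervisor mechanism is involved.
    # 'store' is any shared mapping, e.g. {"free_kib": N, "in_use_kib": {}}.
    with lock:
        if amount_kib > store["free_kib"]:
            return False
        store["free_kib"] -= amount_kib
        store["in_use_kib"][domid] = store["in_use_kib"].get(domid, 0) + amount_kib
        return True

store = {"free_kib": 8 * 1024 * 1024, "in_use_kib": {}}
lock = threading.Lock()
print(try_take(store, lock, 1, 512 * 1024))   # True; the shared record now reflects the take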


> AFAIK, Citrix's Dynamic Memory Controller (DMC) in XenServer is
> the only shipping example (in the Xen universe) of that,

I have never worked on the XenServer codebase and I have no clear idea
what this "Dynamic Memory Controller" is.  No-one here has explained
it to me and I have no particular desire to know about it.  Its
design, and its requirements or lack of them, have not influenced my
opinion on your proposals.

As an example to demonstrate that the reaction you are seeing has
nothing to do with whether the originators of the proposal are inside
or outside Citrix, please refer to our cool reaction to the v4v
proposals which also involved new hypervisor functionality.  There
too, we require the case to be made: we need to be able to see that an
out-of-hypervisor approach is not sufficient.


>    Xen decisions made with this paradigm
> in mind heavily favor a single-system model,

Nothing on my side of this conversation is predicated on any
particular deployment paradigm.  It is clear that in both your
proposal and my counter-proposal[1] there is a single place on each
host where memory allocation decisions are made, and in particular
where the memory needs of competing guests etc. are arbitrated.

In your proposal this place is in the hypervisor and the negotiation
between the competing resource users is "grab the memory if you want
to".  Naturally an in-hypervisor arbitration facility has to be very
simple and a sophisticated policy is difficult to apply.

In my counter-proposal this negotiation occurs between the guest and
an out-of-hypervisor per-host arbitrator of some kind.

I think you are going to say that in your system the guests decide for
themselves how much memory to claim based on their views of how much
is free, and whether their allocations fail.  However, there is no
particular reason why the information about how much memory is free,
and how much has been committed for each purpose, could not be
collected somewhere outside the hypervisor.
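(Continuing the hypothetical arbitrator sketch above: the "how much is
free" and "how much is committed" figures that guests consult before
deciding what to claim can just as well come from that per-host record
as from the hypervisor's allocator.)

def report(arbitrator):
    # The numbers guests base their decisions on, collected and served
    # entirely outside the hypervisor.
    with arbitrator.lock:
        return {"free_kib": arbitrator.free_kib,
                "committed_kib_total": sum(arbitrator.claims.values()),
                "committed_kib_by_domain": dict(arbitrator.claims)}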

[1] I don't have a detailed counter-proposal design of course, but
that's mostly because the information and reasoning you have provided
about your objectives and constraints is rather vague.


I agree with George that you should consider allowing someone else to
have a go at explaining things to us.  If this new hypercall is indeed
needed then all that is required is a clear and logical explanation of
why this is so.  I'm sorry to say that your efforts in this direction
so far have not been sufficient, and I feel that our attempts to
elicit explanations from you have not been as successful as they
needed to be.

I would love to help Oracle out by solving this problem which is
evidently causing a lot of trouble.  But it's difficult, and in
particular we do seem to be having serious trouble communicating with
you.


Sorry,
Ian.



 

