
Re: [Xen-devel] Please ack XENMEM_claim_pages hypercall?



Dan Magenheimer writes ("RE: Please ack XENMEM_claim_pages hypercall?"):
> From a single-system-xl-toolstack-centric perspective ("paradigm"),
> I can see your point.

I don't think this is the case.  What you are doing is putting this
node-specific claim functionality in the hypervisor.  I still think it
should be done outside the hypervisor.  This does not mean that there
has to be a single omniscient piece of software for an entire
cluster.

It just means that there has to be a single omniscient piece of
software _for a particular host_.  In your proposal that piece of
software is the hypervisor - specifically, the part of the hypervisor
that does the bookkeeping of these claims.  However I am still
unconvinced that this can't be implemented outside the hypervisor.


In general it is a key design principle of a system like Xen that the
hypervisor should provide only those facilities which are strictly
necessary.  Any functionality which can be reasonably provided outside
the hypervisor should be excluded from it.  It is this principle which
you are running up against.


It may be (and it seems likely from what you've said in private email)
that some of your existing guests make some assumptions about the
semantics of the Xen ballooning memory interface, which might be
violated by such a design.  Specifically it appears that your design
has guests deliberately balloon down further than requested by the
toolstack, in a kind of attempt to negotiate with other users of
memory on the system by actually releasing and claiming Xen memory.

But I don't think it's reasonable to demand that the shared codebase
reflect such undocumented prior assumptions.  Particularly when this
design seems poor.  I find it poor because (a) using actual memory
allocation and release provides only a very impoverished signalling
mechanism, and (b) it imposes on every part of the system the possibility
of unexpected memory allocation failure.  (Your claim hypercall is an
attempt to mitigate (b) for some but not all cases.)

It seems to me that the correct approach is to design and implement a
new interface which allows a guest (whether that be its kernel or an
agent) to conduct a richer negotiation with out-of-hypervisor
toolstack software.  But that toolstack software (which might take any
particular shape - certainly we don't want to make any assumptions
about its policies and nature) needs to be sufficiently aware of all
the claims to arbitrate between them, in a way that ensures that
guests which obey the rules never see an "out of memory" from the
hypervisor.
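To make that concrete, here is a minimal sketch of the sort of
per-host arbitrator I have in mind.  It is purely illustrative - the
names, the choice of Python, and the assumption that guests reach it
over something like xenstore are all mine, not a design:

import threading

class HostMemoryArbitrator:
    """Hypothetical out-of-hypervisor, per-host bookkeeper of memory
    promises ("claims").  Guests which obey the rules only populate
    memory they have been promised, so they never see an unexpected
    out-of-memory from the hypervisor."""

    def __init__(self, host_free_kib):
        self.lock = threading.Lock()
        self.free_kib = host_free_kib    # not yet promised to anyone
        self.claims = {}                 # domid -> kib promised, not yet used

    def claim(self, domid, amount_kib):
        # A guest (or the toolstack acting for it) asks for a promise
        # of memory.  The promise is recorded here, not in the hypervisor.
        with self.lock:
            if amount_kib > self.free_kib:
                return False             # refused up front, no surprise later
            self.free_kib -= amount_kib
            self.claims[domid] = self.claims.get(domid, 0) + amount_kib
            return True

    def consume(self, domid, amount_kib):
        # The guest actually populates memory it was promised earlier.
        with self.lock:
            assert amount_kib <= self.claims.get(domid, 0), "guest broke the rules"
            self.claims[domid] = self.claims.get(domid, 0) - amount_kib

    def release(self, domid, amount_kib):
        # The guest gives back part of a promise it no longer needs.
        with self.lock:
            self.claims[domid] = self.claims.get(domid, 0) - amount_kib
            self.free_kib += amount_kib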

Of course if it really is desired to have each guest make its own
decisions and simply for them to somehow agree to divvy up the
available resources, then even so a new hypervisor mechanism is not
needed.  All that is needed is a way for those guests to synchronise
their accesses and updates to shared records of the available and
in-use memory.
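(Again only a sketch, with made-up names: the "shared records" need be
no more than a record updated under mutual exclusion - via xenstore
transactions, a lock on shared storage, or whatever the guests already
have in common.)

import threading

def try_take(store, lock, domid, amount_kib):
    # Guests divvying up memory among themselves by updating a shared
    # record under a lock; no new hypervisor mechanism is involved.
    # 'store' is any shared mapping, e.g. {"free_kib": N, "in_use_kib": {}}.
    with lock:
        if amount_kib > store["free_kib"]:
            return False
        store["free_kib"] -= amount_kib
        store["in_use_kib"][domid] = store["in_use_kib"].get(domid, 0) + amount_kib
        return True

store = {"free_kib": 8 * 1024 * 1024, "in_use_kib": {}}
lock = threading.Lock()
print(try_take(store, lock, 1, 512 * 1024))   # True; the shared record now reflects the take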


> AFAIK, Citrix's Dynamic Memory Controller (DMC) in XenServer is
> the only shipping example (in the Xen universe) of that,

I have never worked on the XenServer codebase and I have no clear idea
what this "Dynamic Memory Controller" is.  No-one here has explained
it to me and I have no particular desire to know about it.  Its
design, and its requirements or lack of them, have not influenced my
opinion on your proposals.

As an example to demonstrate that the reaction you are seeing has
nothing to do with whether the originators of the proposal are inside
or outside Citrix, please refer to our cool reaction to the v4v
proposals which also involved new hypervisor functionality.  There
too, we require the case to be made: we need to be able to see that an
out-of-hypervisor approach is not sufficient.


>    Xen decisions made with this paradigm
> in mind heavily favor a single-system model,

Nothing on my side of this conversation is predicated on any
particular deployment paradigm.  It is clear that in both your
proposal and my counter-proposal[1] there is a single place on each
host where memory allocation decisions are made, and in particular
where the memory needs of competing guests etc. are arbitrated.

In your proposal this place is in the hypervisor and the negotiation
between the competing resource users is "grab the memory if you want
to".  Naturally an in-hypervisor arbitration facility has to be very
simple and a sophisticated policy is difficult to apply.

In my counter-proposal this negotiation occurs between the guest and
an out-of-hypervisor per-host arbitrator of some kind.

I think you are going to say that in your system the guests decide for
themselves how much memory to claim based on their views of how much
is free, and whether their allocations fail.  However, there is no
particular reason why the information about how much memory is free,
and how much has been committed for each purpose, could not be
collected somewhere outside the hypervisor.
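(Continuing the hypothetical arbitrator sketch above: the "how much is
free" and "how much is committed" figures that guests consult before
deciding what to claim can just as well come from that per-host record
as from the hypervisor's allocator.)

def report(arbitrator):
    # The numbers guests base their decisions on, collected and served
    # entirely outside the hypervisor.
    with arbitrator.lock:
        return {"free_kib": arbitrator.free_kib,
                "committed_kib_total": sum(arbitrator.claims.values()),
                "committed_kib_by_domain": dict(arbitrator.claims)}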

[1] I don't have a detailed counter-proposal design of course, but
that's mostly because the information and reasoning you have provided
about your objectives and constraints is rather vague.


I agree with George that you should consider allowing someone else to
have a go at explaining things to us.  If this new hypercall is indeed
needed then all that is required is a clear and logical explanation of
why this is so.  I'm sorry to say that your efforts in this direction
so far have not been sufficient, and I feel that our attempts to
elicit explanations from you have not been as successful as they
needed to be.

I would love to help Oracle out by solving this problem which is
evidently causing a lot of trouble.  But it's difficult, and in
particular we do seem to be having serious trouble communicating with
you.


Sorry,
Ian.



 

