Re: [Xen-devel] domain creation vs querying free memory (xend and xl)

To: Dan Magenheimer <dan.magenheimer@xxxxxxxxxx>
From: Andres Lagar-Cavilla <andreslc@xxxxxxxxxxxxxx>
Date: Wed, 17 Oct 2012 16:14:57 -0400
Cc: Olaf Hering <olaf@xxxxxxxxx>, "Keir (Xen.org)" <keir@xxxxxxx>, Ian Campbell <Ian.Campbell@xxxxxxxxxx>, Konrad Wilk <konrad.wilk@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andres Lagar-Cavilla <andreslc@xxxxxxxxxxxxxx>, "Tim (Xen.org)" <tim@xxxxxxx>, xen-devel@xxxxxxxxxxxxx, George Shuklin <george.shuklin@xxxxxxxxx>, Dario Faggioli <raistlin@xxxxxxxx>, Kurt Hackel <kurt.hackel@xxxxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
Delivery-date: Wed, 17 Oct 2012 20:15:32 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Oct 17, 2012, at 3:46 PM, Dan Magenheimer <dan.magenheimer@xxxxxxxxxx> wrote:

>> From: Andres Lagar-Cavilla [mailto:andreslc@xxxxxxxxxxxxxx]
>> Subject: Re: [Xen-devel] domain creation vs querying free memory (xend and xl)
> 
> Hi Andres --
> 
> Re reply just sent to George...
> 
> I think you must be on a third planet, revolving somewhere between
> George's and mine.  I say that because I agree completely with some
> of your statements and disagree with the conclusions you draw from
> them! :-)
> 
>> Domains can be cajoled into obedience via the max_pages tweak -- which I
>> profoundly dislike. If anything we should change the hypervisor to have a
>> "current_allowance" or similar field with a more obvious meaning. The
>> abuse of max_pages makes me cringe. Not to say I disagree with its
>> usefulness.
> 
> Me cringes too.  Though I can see from George's view that it makes
> perfect sense.  Since the toolstack always controls exactly how
> much memory is assigned to a domain and since it can cache the
> "original max", current allowance and the hypervisors view of
> max_pages must always be the same.

No. There is room for slack. max_pages (or current_allowance) simply sets an
upper bound which, if reached, triggers the need for memory management
intervention.

> 
> Only if the hypervisor or the domain or the domain's administrator
> can tweak current memory usage without the knowledge of the
> toolstack (which is closer to my planet) does an issue arise.
> And, to me, that's the foundation of this whole thread.
> 
>> Once you guarantee no "ex machina" entities fudging the view of the memory
>> the toolstack has, then all known methods can be bounded in terms of their
>> capacity to allocate memory unsupervised.
>> Note that this implies as well, I don't see the need for a pool of
>> "unshare" pages. It's all in the heap. The toolstack ensures there is
>> something set apart.
> 
> By "ex machina" do you mean "without the toolstack's knowledge"?
> 
> Then how does page-unsharing work?  Does every page-unshare done by
> the hypervisor require serial notification/permission of the toolstack?

No, of course not. But if you want to keep a domain at bay, you keep its
max_pages where you want it to stop growing. When the domain hits that limit it
will fall asleep (not 100% there hypervisor-wise yet, but Real Soon Now (™)),
and a synchronous notification will be sent to a listener.

At that point it's again a memory management decision. Should I increase the 
domain's reservation, page something out, etc? There is a range of 
possibilities that are not germane to the core issue of enforcing memory limits.
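
As a rough illustration of the flow described above, here is a minimal Python
sketch. It is a toy model only: Domain, Hypervisor, alloc_page, on_limit and
grow_by are hypothetical names invented for this example, not Xen or libxl
interfaces, and the "grow the limit in steps" policy is just one of the many
possible decisions a listener could take.

```python
# Toy model only: none of these names are Xen or libxl APIs.
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Domain:
    name: str
    max_pages: int        # upper bound enforced by the modelled hypervisor
    tot_pages: int = 0    # pages the domain currently holds
    paused: bool = False


class Hypervisor:
    def __init__(self, free_pages: int,
                 on_limit: Optional[Callable[["Hypervisor", Domain], None]] = None):
        self.free_pages = free_pages
        self.on_limit = on_limit  # listener that gets the synchronous kick

    def _blocked(self, dom: Domain) -> bool:
        return dom.tot_pages >= dom.max_pages or self.free_pages == 0

    def alloc_page(self, dom: Domain) -> bool:
        """Hand one heap page to dom, e.g. to satisfy an unshare."""
        if self._blocked(dom):
            dom.paused = True                # the domain falls asleep
            if self.on_limit is not None:
                self.on_limit(self, dom)     # memory management decides here
            if self._blocked(dom):
                return False                 # stays asleep, nothing allocated
            dom.paused = False               # limit was lifted, carry on
        dom.tot_pages += 1
        self.free_pages -= 1
        return True


def grow_by(step: int, ceiling: int) -> Callable[[Hypervisor, Domain], None]:
    """Example policy: raise max_pages in coarse steps up to an admin ceiling."""
    def listener(hv: Hypervisor, dom: Domain) -> None:
        if dom.max_pages + step <= ceiling:
            dom.max_pages += step
    return listener


hv = Hypervisor(free_pages=100, on_limit=grow_by(step=4, ceiling=8))
guest = Domain("guest", max_pages=4)
results = [hv.alloc_page(guest) for _ in range(10)]
```

The common path never involves the listener; it is consulted only when the
domain reaches its limit, which is what keeps the toolstack out of the
per-page allocation path.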

> Or is this "batched", in which case a pool is necessary, isn't it?
> (Not sure what you mean by "no need for a pool" and then "toolstack
> ensures there is something set apart"... what's the difference?)

I am under the impression there is a proposal floating around for a
hypervisor-maintained pool of pages to immediately relieve unsharing, much like
the one that exists now for PoD (the PoD cache). This is what I think is not
necessary.

> 
> My point is, whether there is no pool or a pool that sometimes
> runs dry, are you really going to put the toolstack in the hypervisor's
> path for allocating a page so that the hypervisor can allocate
> a new page for CoW to fulfill an unshare?

Absolutely not.

> 
>> Something that I struggle with here is the notion that we need to extend
>> the hypervisor for any aspect of the discussion we've had so far. I just
>> don't see that. The toolstack has (or should definitely have) a non-racy
>> view of the memory of the host. Reservations are therefore notions the
>> toolstack manages.
> 
> In a perfect world where the toolstack has an oracle for the
> precise time-varying memory requirements for all guests, I
> would agree.

With the mechanism outlined, the toolstack needs to make coarse-grained,
infrequent decisions. There is a possibility of pathological misbehavior -- I
think there is always that possibility. Correctness is preserved; at worst,
performance will be hurt.

It's really important to keep things separate in this discussion. The
toolstack+hypervisor are enabling (1) control over how memory is allocated to
what, (2) control over a domain's ability to grow its footprint unsupervised,
and (3) control over a domain's footprint with PV mechanisms from within, or
externally.

Performance is not up to the toolstack but to the memory manager magic the 
toolstack enables with (3).

> 
> In that world, there's no need for a CPU scheduler either...
> the toolstack can decide exactly when to assign each VCPU for
> each VM onto each PCPU, and when to stop and reassign.
> And then every PCPU would be maximally utilized, right?
> 
> My point: Why would you resource-manage CPUs differently from
> memory?  The demand of real-world workloads varies dramatically
> for both... don't you want both to be managed dynamically,
> whenever possible?
> 
> If yes (dynamic is good), in order for the toolstack's view of
> memory to be non-racy, doesn't every hypervisor page allocation
> need to be serialized with the toolstack granting notification/permission?

Once you bucketize RAM and know you will get synchronous kicks as buckets fill
up, then you have a non-racy view. If you choose buckets of width one...

> 
>> I further think the pod cache could be converted to this model. Why have
>> specific per-domain lists of cached pages in the hypervisor? Get them back
>> from the heap! Obviously places a decoupled requirement of certain
>> toolstack features. But allows to throw away a lot of complex code.
> 
> IIUC in George's (Xapi) model (or using Tim's phrase, "balloon-to-fit")
> the heap is "always" empty because the toolstack has assigned all memory.

I don't think that's what they mean. Nor is it what I mean. The toolstack may 
chunk memory up into abstract buckets. It can certainly assert that its 
bucketized view matches the hypervisor view. Pages flow from the heap to each 
domain -- but the bucket "domain X" will not overflow unsupervised.
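
To make the bucket idea a bit more tangible, here is a small Python sketch of
the bookkeeping a toolstack could do. It is illustrative only: BucketView and
its methods are hypothetical names, and the usage figures passed to
consistent_with stand in for whatever per-domain page counts the hypervisor
reports.

```python
# Toy bookkeeping only: BucketView is a made-up name, not a toolstack API.
from typing import Dict


class BucketView:
    """Toolstack-side partition of host memory into per-domain buckets."""

    def __init__(self, host_pages: int):
        self.host_pages = host_pages
        self.buckets: Dict[str, int] = {}   # domain name -> page allowance

    def set_bucket(self, name: str, pages: int) -> None:
        """Create or resize a bucket; refuse to promise more than the host has."""
        others = sum(v for k, v in self.buckets.items() if k != name)
        if pages < 0 or others + pages > self.host_pages:
            raise ValueError("buckets would exceed host memory")
        self.buckets[name] = pages

    def guaranteed_free(self) -> int:
        """Pages not promised to any bucket: a lower bound on free heap."""
        return self.host_pages - sum(self.buckets.values())

    def consistent_with(self, usage: Dict[str, int]) -> bool:
        """Check that no domain holds more pages than its bucket allows."""
        return all(used <= self.buckets.get(name, 0)
                   for name, used in usage.items())


view = BucketView(host_pages=1000)
view.set_bucket("domA", 600)
view.set_bucket("domB", 300)
```

Because no bucket can overflow unsupervised, guaranteed_free() stays valid
between toolstack decisions even while pages keep flowing from the heap into
domains; that is the sense in which the view is non-racy.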

> So I'm still confused... where does "page unshare" get memory from
> and how does it notify and/or get permission from the toolstack?

Regarding sharing, as should be clear by now, the answer is "it doesn't matter".
If unsharing cannot be satisfied from the heap, then memory management in dom0
is invoked. Heavy-weight, but it means you've hit an admin-imposed limit.

Please note that this notion of limits and enforcement is sparingly applied 
today, to the best of my knowledge. But imho it'd be great to meaningfully work 
towards it.

Andres
> 
>> My two cents for the new iteration
> 
> I'll see your two cents, and raise you a penny! ;-)
> 
> Dan
