Xen project Mailing List

[Xen-ia64-devel] RE: Xen/ia64 - global or per VP VHPT

To: "Magenheimer, Dan \(HP Labs Fort Collins\)" <dan.magenheimer@xxxxxx>, "Yang, Fred" <fred.yang@xxxxxxxxx>, "Dong, Eddie" <eddie.dong@xxxxxxxxx>

From: "Munoz, Alberto J" <alberto.j.munoz@xxxxxxxxx>

Date: Mon, 2 May 2005 10:27:00 -0700

Cc: ipf-xen <ipf-xen@xxxxxxxxx>, xen-ia64-devel@xxxxxxxxxxxxxxxxxxx

Delivery-date: Mon, 02 May 2005 17:26:37 +0000

List-id: DIscussion of the ia64 port of Xen <xen-ia64-devel.lists.xensource.com>


Thread-topic: Xen/ia64 - global or per VP VHPT

Hi Dan,

I am going to try to combine my replies to your previous two messages here and then point out where I think the tradeoffs between the two approaches lie. I will then leave it up to the people involved with the implementation to work things out going forward (what I mean here may be clearer after reading the rest of this message), as I don't think I can contribute much there.

Magenheimer, Dan (HP Labs Fort Collins) <mailto:dan.magenheimer@xxxxxx> wrote on Sunday, May 01, 2005 11:42 AM:

>> Please let's talk about specifics and how they relate to the issues of:
>>
>> - Scalability (additional contention in a Global VHPT)
>
> I see your lock contention argument. Is the contention any
> worse for 10 domains contending for a global VHPT than an existing
> 10-way SMP (e.g. HP-UX, not virtualized) contending for an lVHPT?

When I refer to "contention" here, I am talking only about contention in the VHPT.

The answer to this question depends on a number of factors. First, I want to point out that contention related to supporting 10 UP VMs is non-existent with a per-domain VHPT implementation. As you point out, with the global VHPT implementation it could be equivalent to the contention suffered by an OS (HP-UX in your example) running on a 10-way SMP system. My guess (I have no measurements) is that the contention for a VMM supporting 10 UP VMs could be worse than the contention experienced by a mature OS supporting a 10-way SMP system, mainly because of RID allocation issues (this problem is more difficult for a VMM than for a single OS). I do agree, however, that virtualizing RIDs is the way to manage this problem, and also that when supporting OSs that use the short-format page table, the RID allocation problem may also exist in the per-VM VHPT implementation (as Matt Chapman pointed out).
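
The sharing behind this contention argument can be illustrated with a toy model. What follows is only a rough sketch, not Xen code: `toy_hash` is a made-up stand-in for the hardware VHPT hash, and the table size, page counts and RID assignment are all hypothetical. It counts how many hash buckets receive inserts from more than one VM, which is the situation that forces synchronization; it does not measure actual lock contention, which also depends on timing.

```python
from collections import defaultdict


def toy_hash(rid, vpn, nbuckets):
    """Hypothetical stand-in for the VHPT hash; NOT the real IA-64 function."""
    return (vpn ^ rid) % nbuckets


def multi_writer_buckets(inserts, nbuckets):
    """Count buckets written by more than one VM.

    inserts is an iterable of (vm, rid, vpn) tuples that all target one table.
    """
    writers = defaultdict(set)
    for vm, rid, vpn in inserts:
        writers[toy_hash(rid, vpn, nbuckets)].add(vm)
    return sum(1 for vms in writers.values() if len(vms) > 1)


NUM_VMS = 10     # ten uniprocessor (UP) VMs, as in the example above
PAGES = 64       # pages touched per VM (made-up number)
NBUCKETS = 256   # table size in buckets (made-up number)

# Each VM gets its own RID (vm + 1) and touches virtual pages 0..PAGES-1.
inserts = [(vm, vm + 1, vpn) for vm in range(NUM_VMS) for vpn in range(PAGES)]

# Global VHPT: every VM inserts into the same table.
global_shared = multi_writer_buckets(inserts, NBUCKETS)

# Per-domain VHPT: one table per VM, so each table has a single writer.
per_domain_shared = sum(
    multi_writer_buckets([i for i in inserts if i[0] == vm], NBUCKETS)
    for vm in range(NUM_VMS)
)

print("global VHPT buckets with several writers:", global_shared)
print("per-domain VHPT buckets with several writers:", per_domain_shared)
```

With UP VMs the per-domain count is zero by construction, since each table has exactly one writer. The global count depends entirely on the assumed hash and workload, so only measurements on a real system can say how costly the sharing is.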

As an aside, the issues I have with comparing VMM architectures with traditional operating systems are:

1- A VMM and an OS are very different in terms of resource management (and allocation granularity). Something that works well for an OS may not necessarily work well for a VMM. The main reason I argue this point is that processes and VMs interact very differently. This may be self-evident to some, but I have had long debates with people regarding this point.

2- I expect that VMMs will have to scale way beyond what OSs scale to today, to help (along with partitioning) address cases in which a single OS cannot scale to the size of a full machine. By the way, in my opinion large machines are much more important and relevant to IPF than to x86.

>> I have not seen this. Would you mind sending me a pointer to this? I tend to
>> follow these discussions sporadically, so I missed that one email.
>
> http://lists.xensource.com/archives/html/xen-ia64-devel/2005-04/msg00012.html
>
> Please note this is just a couple of weeks' work (based on experience
> from vBlades) so please ask questions rather than shoot
> bullets at it. It's definitely a work in progress.

OK. I'll try to be gentle :-)

> That's the point I was trying to make. Wasteful is not
> strong enough though... if you have 64 such domains, all
> of memory is used for VHPTs. So I think some mechanism
> for growing/shrinking per-domain VHPTs needs to be part of the
> design or a lot of "utility computing" flexibility is lost.

I think the difference between your argument and mine boils down to whether or not the ability to grow/shrink the VHPT is necessary or beneficial in all cases (including the global VHPT case). The argument I can offer is that if you want to support dynamic addition/replacement of memory in systems (I tend to think this functionality is more critical for larger systems, and I tend to equate larger systems with IPF), having the ability to grow/shrink the VHPT will be important no matter what the VHPT implementation is (global or per domain).

By the way, I do agree with Mark Williamson's observation that allocating all memory for the VHPT at boot time may not be wasteful if there are no VMs to use it anyway (although there is always the question of whether or not that memory could be used for something beyond the VMM... but that is not very relevant to this discussion).

These comments came from a different email message, but I think they are relevant to the discussion of shrinking/growing VHPTs (D> means Dan, B> means Bert):

B>> No, the VHPT does not have to be pinned by a TR. People do
B>> it, but it does not have to be that way.

D> It doesn't architecturally, but it does practically, right?

It does for current OS implementations. This does not mean it is the best thing for a VMM. This is an example of what I mean by my issue number 2 above, regarding comparing OSs and VMMs.

D> If the VHPT is greatly fragmented, I'll bet nearly all of
D> the performance advantage is gone due to extra misses and/or
D> loss of usable entries in the DTLB.

The VHPT does not have to be greatly fragmented. If we agree to preallocate memory for VHPTs (as we need to do for the global VHPT case), then we should be able to manage things at a larger granularity than 4K (covering the VHPT may need more than one TLB entry, but not one entry per 4K chunk in the VHPT).

> Not to mention the complexity of the psr.ic-off code to handle this...

I am not sure there is as much complexity as you think. In any case, I do think it all boils down to whether or not we believe dynamically sized VHPTs will be necessary in the future.

D> My points are:
D> Growing or shrinking is not necessary for a global VHPT because
D> it is scaled to the actual physical memory in the machine
D> rather than the sum of the virtual physical memory of N
D> domains.
D> Preallocation is much easier for the global VHPT because it need
D> not grow or shrink (ignoring hot-plug machine memory) nor is
D> it proportional to the number of domains.

True. My argument here is that I think there are other reasons for wanting this functionality (shrinking/growing the VHPT), such as the ability to dynamically add/remove memory from a system (I don't think we should ignore this issue, as you suggest). If we consider this together with the scalability tradeoffs of the global VHPT, deciding to implement a dynamic VHPT may not be that hard to swallow.

D> If the number of domains is dynamic (especially wildly so),
D> allocating memory for per-domain VHPTs is going to be painful.
D> And if your solution to this is "if it hurts don't do that"
D> (meaning don't allow the number of domains to be dynamic or
D> the amount of (meta)physical memory to be dynamic),
D> I'd consider that a design problem with per-domain VHPT.

If we agree to preallocate memory for the VHPT (as is required for the global VHPT case), and replace the requirement to cover the entire VHPT with a single TR with the ability to minimize the number of memory chunks making up the VHPT (this can be done because the memory is preallocated), then this problem can be addressed.

The question here is: in your example of a wildly dynamic number of VMs, which is better?

- Having the ability to allocate the VHPT memory in one chunk, but having to suffer the overhead of synchronizing, on a single VHPT, all of those VMs being created and deleted along with the memory accesses of the running ones.

Or

- Not having the VHPT synchronization overhead, but having to support shrinking/growing VHPTs.

>> You keep on making this differentiation between full and paravirtualization
>> (but I don't think that is very relevant to what I am saying), please
>> explain how in a paravirtualized guest the example I presented above of 10
>> UP VMs having to synchronize updates to the VHPT is not an issue.
>
> You are likely correct. But it is a small matter of coding
> to add the synchronization. Then if performance is poor, we
> tell system administrators that the per-domain VHPT
> may be preferable on highly-scalable systems -- at the loss
> of some flexibility in dynamic domain migration/ballooning.
>
> And if it turns out that per-domain VHPT works "better" for
> ALL workloads, then I will admit I was wrong and pull the
> support for global VHPT. But until then it should be left
> as an option (for non-VT domains).

I agree that the best way to address this type of argument is to measure. The main question is whether or not both mechanisms can be supported naturally in the same source base without having to trade off other important stuff. In any case, I am definitely not in a position to comment on how viable it is to support both mechanisms in the existing implementation, or on what else may be affected by it.

If everyone agrees that doing both implementations in the same source base is feasible and does not adversely affect other stuff, then I have no objection to what you propose.

>
> Dan

Bert
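
For reference, the point made earlier in this message about managing a preallocated VHPT at a granularity larger than 4K comes down to simple arithmetic. The sketch below is illustrative only: the 16 MiB VHPT size and the chunk sizes are assumed values chosen for the example, not figures from this thread or from the Xen/ia64 code, and `tlb_entries_to_cover` is a hypothetical helper name.

```python
def tlb_entries_to_cover(vhpt_bytes, chunk_bytes):
    """Translations needed to map a VHPT built from equal-sized chunks."""
    if vhpt_bytes <= 0 or chunk_bytes <= 0:
        raise ValueError("sizes must be positive")
    return -(-vhpt_bytes // chunk_bytes)  # ceiling division


KIB = 1024
MIB = 1024 * KIB

VHPT_BYTES = 16 * MIB  # hypothetical VHPT size, for illustration only

# Compare 4K pieces against larger preallocated chunks (sizes are assumptions).
for chunk in (4 * KIB, 64 * KIB, 4 * MIB, 16 * MIB):
    entries = tlb_entries_to_cover(VHPT_BYTES, chunk)
    print(f"{chunk // KIB:>6} KiB chunks -> {entries} entries")
```

Under these assumptions, a VHPT assembled from a handful of large chunks needs a handful of translations rather than thousands, which is the middle ground between a single pinned TR and a fully fragmented table.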


_______________________________________________ Xen-ia64-devel mailing list Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-ia64-devel
