
[Xen-devel] RE: [RFC] NUMA support



Hi all,
>thanks Ronghui for your patches and ideas. To make a more structured
>approach to a better NUMA support, I suggest to concentrate on
>one-node-guests first:

That is exactly what we want to do at first: not supporting guest NUMA.

>* introduce CPU affinity to memory allocation routines called from Dom0.
>This is basically my patch 2/4 from August. We should think about using
>a NUMA node number instead of a physical CPU, is there something to be
>said against this?

I think it is reasonable to bind a guest to a node rather than to a CPU.
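
To make that concrete, a minimal sketch of expanding a node number into a
CPU affinity mask could look like this (the layout where node n owns a
contiguous range of CPUs and the helper name are made up; the real mapping
has to come from the hypervisor's topology information):

#include <stdint.h>

/* Sketch only: assumes at most 64 physical CPUs and that node n owns
 * CPUs [n*cpus_per_node, (n+1)*cpus_per_node). */
static uint64_t node_to_cpumask(unsigned int node, unsigned int cpus_per_node)
{
    uint64_t mask = 0;
    unsigned int cpu;

    for (cpu = node * cpus_per_node; cpu < (node + 1) * cpus_per_node; cpu++)
        mask |= 1ULL << cpu;        /* one bit per physical CPU */

    return mask;
}

With something like this, the allocation and affinity interfaces could take
a node number and derive the CPU mask internally.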

>* find _some_ method of load balancing when creating guests. The method
>1 from Ronghui is a start, but a real decision based on each node's
>utilization (or free memory) would be more reasonable.

Yes, it is only a start for balancing.
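
As a strawman for a real decision, picking the node with the most free
memory at guest creation could look like the sketch below (free_pages[] is
a hypothetical per-node free page count that the tools would have to get
from the hypervisor):

/* Sketch only: choose the least loaded node, measured by free memory. */
static unsigned int pick_least_loaded_node(const unsigned long *free_pages,
                                           unsigned int nr_nodes)
{
    unsigned int node, best = 0;

    for (node = 1; node < nr_nodes; node++)
        if (free_pages[node] > free_pages[best])
            best = node;

    return best;
}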

>* patch the guest memory allocation routines to allocate memory from
>that specific node only (based on my patch 3/4)
For performance reasons, we should do this.
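
A rough sketch of the per-page decision is below. Whether to fall back to
another node or to fail the allocation when the preferred node is exhausted
is a policy question, and free_pages[] is again a hypothetical per-node
free page count:

/* Sketch only: prefer the guest's node, fall back to any node with free
 * memory rather than failing the allocation. */
static unsigned int choose_alloc_node(unsigned int preferred,
                                      const unsigned long *free_pages,
                                      unsigned int nr_nodes)
{
    unsigned int node;

    if (free_pages[preferred] > 0)
        return preferred;               /* stay on the guest's node */

    for (node = 0; node < nr_nodes; node++)
        if (free_pages[node] > 0)
            return node;                /* better off-node than failing */

    return preferred;                   /* nothing free anywhere */
}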

>* use live migration to local host to allow node migration. Assuming
>that localhost live migration works reliably (is that really true?) it
>shouldn't be too hard to implement this (basically just using node
>affinity while allocating guest memory). Since this is a rather
>expensive operation (takes twice the memory temporarily and quite some
>time), I'd suggest to trigger that explicitly from the admin via a xm
>command, maybe as an addition to migrate:
># xm migrate --live --node 1 <domid> localhost
>There could be some Dom0 daemon based re-balancer to do this somewhat
>automatically later on.
>
>I would take care of the memory allocation patch and would look into
>node migration. It would be great if Ronghui or Anthony would help to
>improve the "load balancing" algorithm.

I have no concrete ideas on this yet.
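
As a strawman for discussion only, the trigger condition for such a
re-balancer might look roughly like this (the threshold and the inputs are
made up):

/* Sketch only: rebalance when the free-memory imbalance between the
 * fullest and the emptiest node crosses a threshold. */
static int should_rebalance(const unsigned long *free_pages,
                            unsigned int nr_nodes)
{
    unsigned int node, min = 0, max = 0;

    for (node = 1; node < nr_nodes; node++) {
        if (free_pages[node] < free_pages[min]) min = node;
        if (free_pages[node] > free_pages[max]) max = node;
    }

    /* e.g. act when the fullest node has less than half the free memory
     * of the emptiest one */
    return free_pages[min] * 2 < free_pages[max];
}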

>Meanwhile I will continue to patch that d*** Linux kernel to accept both
>CONFIG_NUMA and CONFIG_XEN without crashing that early ;-). This should
>allow both HVM and PV guests to support multiple NUMA nodes within one
>guest.
>
>Also we should start a discussion on the config file options to add:
>Shall we use "numanodes=<nr of nodes>", something like "numa=on" (for
>one-node-guests only), or something like "numanode=0,1" to explicitly
>specify certain nodes?

Because we don't support guest NUMA now, we don't need these configuration
options yet. If we do need to support guest NUMA, I think users may even
want to configure each node's layout, i.e. how many CPUs or how much memory
is in that node. I think that would be too complicated. ^_^
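
Just for illustration, the simplest of the proposed options would only add
one line to a normal guest config file (values made up):

# hypothetical guest config using one of the proposed options
memory    = 1024
vcpus     = 2
numanodes = 1     # restrict this guest to a single NUMA node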

>Any comments are appreciated.
>
>> I read your patches and Anthony's comments, and wrote a patch based on
>> them:
>>
>> 1:   If the guest sets numanodes=n (the default is 1, meaning the guest
>>      will be restricted to one node), the hypervisor will choose the
>>      begin node to pin this guest to using round robin. But the method
>>      I use needs a spin_lock to prevent domains from being created at
>>      the same time. Are there any better methods? I hope for your
>>      suggestions.
>That's a good start, thank you. Maybe Keir has some comments on the
>spinlock issue.
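
To make the locking question concrete, the round-robin start-node choice is
roughly the pattern below (shown as a userspace sketch with a pthread mutex
standing in for the spin_lock the hypervisor side would need; the names are
made up):

#include <pthread.h>

static unsigned int next_node;
static pthread_mutex_t next_node_lock = PTHREAD_MUTEX_INITIALIZER;

/* Sketch only: hand out start nodes round robin; the shared counter is
 * why concurrent domain creation needs serializing. */
static unsigned int pick_start_node(unsigned int nr_nodes)
{
    unsigned int node;

    pthread_mutex_lock(&next_node_lock);
    node = next_node;
    next_node = (next_node + 1) % nr_nodes;
    pthread_mutex_unlock(&next_node_lock);

    return node;
}
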
>> 2:   Pass the node parameter in the higher bits of flags when creating
>>      the domain. The domain can then record the node information in the
>>      domain struct for further use, i.e. to show which node to pin to
>>      in setup_guest. If we use this method, your patch can simply
>>      balance nodes just like below:
>>
>>> +    for ( i = 0; i <= dominfo.max_vcpu_id; i++ )
>>> +    {
>>> +        node = (i * numanodes) / (dominfo.max_vcpu_id + 1) +
>>> +               dominfo.first_node;
>>> +        xc_vcpu_setaffinity(xc_handle, dom, i, nodemasks[node]);
>>> +    }
>How many bits do you want to use? Maybe it's not a good idea to abuse
>some variable to hold a limited number of nodes only ("640K ought to be
>enough for anybody" ;-) But the general idea is good.
Actually, if there is no need to support guest NUMA, no parameter needs to
be passed down at all.
It seems that one node per guest is a good approach. ^_^
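
Should passing a node down ever become necessary after all, the bit-packing
idea above could look roughly like this sketch (the shift and the width are
made up and would have to fit the real flags layout):

#include <stdint.h>

#define NODE_FLAGS_SHIFT  16        /* hypothetical position */
#define NODE_FLAGS_MASK   0xffu     /* hypothetical width: 8 bits */

/* Sketch only: pack/unpack a node number in the high bits of the
 * domain-creation flags. */
static uint32_t flags_set_node(uint32_t flags, unsigned int node)
{
    return flags | ((node & NODE_FLAGS_MASK) << NODE_FLAGS_SHIFT);
}

static unsigned int flags_get_node(uint32_t flags)
{
    return (flags >> NODE_FLAGS_SHIFT) & NODE_FLAGS_MASK;
}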

Best regards,
Ronghui

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

