
Re: [Xen-devel] VM Migration on a NUMA server?

On Fri, 2015-07-31 at 02:32 +0000, Kun Cheng wrote:
> Hi all,
> I'm sorry for taking your time, and I'd like to make an enquiry about
> the status of VM migration support on a NUMA server.
Status is: it's not there, and won't happen soon. I've started working
on it, but then got preempted by other issues, and concentrated on
making Xen do the best possible _without_ moving the memory (e.g., with
NUMA-aware scheduling, now achieved through per-vcpu soft affinities).

Moving memory around is really, really tricky. It's probably at least
doable for HVM guests, while, for PV, I'm not even so sure it can be
done! :-/

> Currently it looks like when a vm is migrated, only its vcpus are moved
> to the other node, but not its memory. So, is anyone trying to fix that
> issue?
What do you mean by "when a vm is migrated"? If soft affinity for a VM
is specified in the config file (or afterwards, but I'd recommend doing
it in the config file if you're interested in NUMA effects), memory is
allocated from the NUMA node(s) that such affinity spans, and the Xen
scheduler (provided you're using Credit1, our default scheduler) will
try as hard as it can to schedule the vcpus of the vm on the pcpus of
that same node (or set of nodes).

If it's not possible, because all those pcpus are busy, the vcpus are
allowed to run on some other pcpu, outside of the NUMA node(s) the vm
has affinity with, on the basis that some execution, even with slow
memory access, is better than no execution at all.
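To make this concrete, here's a minimal config-file sketch (the guest
name, memory and vcpu counts are just placeholder values; `cpus_soft'
is the actual xl.cfg key) that gives a VM soft affinity to NUMA node 1:

    # vm.cfg (sketch)
    name = "numa-guest"        # hypothetical guest name
    memory = 2048
    vcpus = 4
    cpus_soft = "node:1"       # soft affinity: prefer node 1's pcpus

With this, memory is allocated on node 1 and Credit1 prefers node 1's
pcpus, but the vcpus can still spill over to other pcpus under load, as
described above.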

If you're interested in having the vcpus of the vm _only_ running on the
pcpus of the node to which the memory is attached, I'd suggest using
hard affinity instead of soft (still specifying it in the config file).
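For example (again a sketch; only the `cpus' key matters here), you'd
use a hard-affinity line instead:

    # vm.cfg (sketch)
    cpus = "node:1"            # hard affinity: vcpus run *only* on node 1's pcpus

Note the trade-off: with hard affinity, if all of node 1's pcpus are
busy, the vcpus simply wait instead of running elsewhere.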

Support for soft affinity in Credit2 is being worked on. For the other
schedulers (ARINC653 and RTDS), it's not that useful.

> If I want to do it myself, it seems like two major problems are ahead
> of me:
> 1) How to specify the target node for memory migration? I'll be
> grateful if anyone can give me  some hints.
I'm not sure I follow. In my mind, if we get this in place at some
point, migration will happen either:
 - automatically, upon load balancing considerations
 - manually, with dedicated libxl interfaces and xl command

At that point, for the latter case, there will be a way of specifying a
target node, and that will most likely be an integer, or a list of
integers.

> 2) Memory Migration. Looks like it can be done by leveraging the
> existing migration related functions on Xen.
Mmmm... Maybe I see what you mean now. So, you want to perform a local
migration, and use that as a way of actually moving the guest to another
node, is that correct? If yes, it did work last time I checked.

If you do it like that, it's true that you don't have any way of
specifying a target node. Therefore, what happens is either:
 - if no soft or hard affinity is specified in the config file, the
   automatic NUMA placement code will run, and it most likely will
   choose a different node for the target vm, but not in a way that you
   can control easily;
 - if any affinity is set, the vm will be re-created on the same exact
   node(s) it was on before.

That is why a way to work around this, and actually use local migration
as a memory-migration mechanism, is to leverage `xl config-update'. In
fact, you can do as follows:

# xl create vm.cfg 'cpus_soft="node:1"'
# xl config-update <domid> 'cpus_soft="node:0"'
# <do a local migration>

As I said, this all worked last time I tried... Is it not working for
you? Or was it something else you were after?
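By the way, to verify what actually happened, you can inspect placement
from dom0 (commands sketched from memory, so double-check them on your
version): `xl vcpu-list <domid>' shows, for each vcpu, the pcpu it's
running on and its affinity, and `xl info -n' dumps the host NUMA
topology, so you can map pcpus back to nodes:

    # xl vcpu-list <domid>     # per-vcpu pcpu and affinity
    # xl info -n               # host NUMA topology (which pcpus belong to which node)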


<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


Xen-devel mailing list


