
Re: [Xen-devel] NUMA-aware VM placement in Xen

On Fri, Feb 24, 2012 at 10:50 AM, Dario Faggioli <raistlin@xxxxxxxx> wrote:
>> It seems to me we
>> have two options:
>> * Have libxl do the NUMA placement on behalf of the toolstack.  In that
>> case, the libxl_domain_create_new function should look at the available
>> memory, the NUMA layout, &c, and then set d->node_affinity before
>> calling xc_hvm_build.
> This can be done. If I got it correctly it is more or less what xm/xend
> already does.
>> * Have the toolstack do it.  In this case, you'd be modifying xl to set
>> d->node_affinity before calling libxl's domain creation function.
> I'm not sure I'm getting this right... It seems very similar to the one
> above.

From Xen's perspective, yes.  But from the libxl perspective, no.
libxl is meant to be the interface we give to other external
toolstacks, so the interface there is important.

>> Do those options work?  Let me know if I've misunderstood anything.
> I think they can be implemented. "work", it depends on how we define
> "work". :-D
> That's why I was struggling for putting this in the hypervisor and not
> in the toolstack because I really think it should live there if
> possible. For example it would be nice for the decision to be protected
> by the proper locking. I mean, what's the point in checking the amount
> of free memory in a node somewhere in (lib)xl, if when the actual
> allocation will happen (in Xen) that might be a completely different
> value (due to concurrent domain creation, destruction, etc.)?

At the moment, pages for a VM are not allocated in one big chunk
anyway -- xc_hvm_build.c:setup_guest() calls
xc_domain_populate_physmap() in a loop, so Xen only ever sees a series
of separate requests.  That puts Xen in a worse position to avoid the
TOCTTOU race than the toolstack.  The toolstack can, at least in
theory, refrain from starting a second VM until the first is
completely allocated.
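
To make the "refrain from starting a second VM" idea concrete, here is
a minimal sketch in Python.  It is a toy model only: the free_mib
table, the placement_lock and create_domain() are all hypothetical
names, not anything in libxl or xl, and a real toolstack would get the
numbers from Xen rather than from a dict.  The point is just that the
check and the chunked allocation sit under one lock, so a concurrent
creation cannot invalidate the check halfway through.

```python
import threading

# Hypothetical toolstack-wide lock: one domain is placed and built at a time.
placement_lock = threading.Lock()


def create_domain(free_mib, mem_mib, chunk_mib=256):
    """Check free memory and allocate it under a single lock.

    free_mib maps a (hypothetical) NUMA node id to its free memory in MiB.
    Returns the node used, or None if no single node can hold the domain.
    """
    with placement_lock:
        # Time of check: first node that can hold the whole domain.
        node = next((n for n, free in free_mib.items() if free >= mem_mib), None)
        if node is None:
            return None
        # Time of use: allocate in chunks, like the builder's populate loop.
        remaining = mem_mib
        while remaining > 0:
            step = min(chunk_mib, remaining)
            free_mib[node] -= step
            remaining -= step
        return node
```

Without the lock, two creations could both see the same node as having
enough room and then overcommit it between them.  The node choice here
(first fit) is beside the point; any policy would do.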

I don't think there's any reason to do it in Xen: it's not
time-critical, and it doesn't require any information that the
toolstack and/or domain builder wouldn't have available.

> I think the config file, supporting cpupools and vcpu-pinning, already
> offer almost all the facilities for manually deploying a VM reflecting a
> specific NUMA-layout. What I was thinking adding was the "numa=auto" or
> whatever switch, so that if one does not (want to) specify cpupools or
> pinning, VM still gets NUMA-sensible placement.

Do we have a way to specify the NUMA layout?  That should be a
separate config option from vcpu pinning.  But yes, I think adding
"numa=auto" (on by default) is the big feature we need; and I think
that's probably best implemented in either the toolstack or libxl.
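
For the sake of discussion, here is a rough sketch in Python of what
an automatic placement policy could look like.  Nothing in this thread
fixes the heuristic, so treat this as one assumption among many: it
takes the nodes with the most free memory first, which keeps the
domain on as few nodes as possible.  auto_place() and the free_mib
table are hypothetical names, not existing libxl or xl interfaces.

```python
def auto_place(free_mib, need_mib):
    """Return the node ids to use as a domain's node affinity.

    free_mib maps a (hypothetical) NUMA node id to its free memory in MiB.
    Takes the nodes with the most free memory first, so the resulting set
    is as small as possible.  Returns [] if the host as a whole does not
    have enough free memory.
    """
    chosen = []
    covered = 0
    # Largest nodes first; ties are broken by the lower node id.
    for node, free in sorted(free_mib.items(), key=lambda kv: (-kv[1], kv[0])):
        if covered >= need_mib:
            break
        if free <= 0:
            continue
        chosen.append(node)
        covered += free
    return sorted(chosen) if covered >= need_mib else []
```

A domain that fits on one node gets exactly one node; only a domain
bigger than any single node's free memory gets spread out.  The result
is what would be written into the node affinity before the domain
builder runs.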

