
Re: [Xen-users] query memory allocation per NUMA node

Hello Dario,

On Thu, Jan 12, 2017 at 8:33 AM Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
On Mon, 2017-01-09 at 14:01 +0000, Kun Cheng wrote:
> I haven't been using NUMA things in recent years, so my intel may not
> be correct.
Actually, I think it's quite difficult to retrieve such info through
a command, as Xen only provides some NUMA placement & scheduling
(load-balancing) support (and the vNUMA feature; maybe it's still
experimental, but last time I tried it, it was functional). From my
understanding, probing memory allocation would be difficult as such
things are dynamic, or maybe it is just not worth the effort.
Things are not at all dynamic. It's actually a matter of storing the
info somewhere in Xen (so we don't have to scan all the pages of a
domain every time), and plumbing that up to xl.

It's not difficult, and it would be well worth the effort. Problem is
finding the time to actually do it. :-)

> Reasons are:
> First, NUMA placement tries to allocate as much memory as possible
> on local nodes (in most cases Xen will find a node which can fit the
> VM's memory requirement; if 4 vCPUs are pinned to node 0, then node
> 0 is a local node), but it seems Xen doesn't track how much memory
> has been allocated to a certain VM on each node in such situations
> (as it tries to allocate as much as possible on one node, assuming a
> VM's vCPUs may be spread among several nodes, rare but possible).
I lost you. As you say, first of all, the placement algorithm determines a
set of NUMA nodes. It may be one or more nodes, depending on the actual
memory requirements of the VM and on how much free memory each node has.

Then, memory is distributed among the nodes that are part of that set
roughly evenly.

That's what happens.
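(As an illustrative aside, and only a sketch: pinning the vCPUs in the
domain config is one way to steer the placement towards a single node.
The "node:" shorthand, the file name, and the values below are
assumptions to check against the xl.cfg manual for your Xen version.)

# cat /etc/xen/mydomain.cfg    (hypothetical fragment)
vcpus  = 4
memory = 1024
cpus   = "node:0"    # hard-pin vCPUs to the pCPUs of NUMA node 0, so
                     # memory is allocated there if enough is free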

> Having 800MB on node 0 is pretty much the same as having 900MB on
> node 0 if your VM requires 1GB: both will have a similar performance
> impact on your VM.
Lost you again. 800 or 900 MB on node 0, and where's the rest? What was
the output of the automatic placement?

OK. What I wanted to say was: consider an example where we have a new VM requiring 1GB of memory, but Xen couldn't find a suitable node due to heavy load. Then perhaps the hypervisor would allocate the memory across two or more nodes (let's say nodes 0 and 1 here). In such a case, the performance impact of having 800 or 900MB allocated on node 0 for that VM would be almost the same. As far as I understand, both would cause a performance drop compared to placing the whole VM on one node; it's just a matter of how big the drop is.
I just wanted to use this example to indicate that once memory is distributed across multiple nodes, there will be a performance drop no matter how the distribution is optimized.
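(As a rough way to eyeball this from dom0, and just a sketch, with
"mydomain" as a placeholder domain name:

# xl info -n
# xl vcpu-list mydomain

The first prints the host NUMA topology, including free memory per
node; the second shows which pCPU each vCPU runs on and its affinity,
so comparing the two tells you which node(s) the VM's vCPUs live on.
The exact output format varies across Xen versions.)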

> Second, a VM can be migrated to other nodes due to load balancing,
> which may make it harder to count how much memory has been allocated
> for a certain VM on each node.
No, it can't. And if it could, updating the counters of how many pages
are moved between nodes wouldn't be difficult at all (while,
unfortunately, other things are, which is why, as I said, that's not
possible yet).

I remembered: with credit2, Xen would only migrate the vCPUs, not the memory already allocated for a VM. I mixed it up with the optimization I wanted to do after I wrote to you a year ago (I thought that could lead to a situation where vCPUs and memory end up on different nodes). At that time I wanted to migrate the memory together with the vCPUs in an elegant way (not just moving the memory, or the hot memory, immediately after each vCPU migration). Sorry for the misleading part.

> If you can't find useful info in Xenstore, then perhaps such feature
> you required is not yet available.
# xl debug-key u
# xl dmesg | tail -20

It's ugly, but gets the job done (and I think it's in the mentioned
wiki page).
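(If 20 lines of dmesg aren't enough to catch the whole dump, a simple
variant, assuming nothing beyond the two commands above; the exact
per-node text in the output differs across Xen versions:

# xl debug-key u
# xl dmesg | grep -i node

This triggers the hypervisor's memory-info dump and then filters the
console ring for the per-node lines.)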

> However, if you just want to know the memory usage on each node,
> perhaps you could try numactl and get some outputs? Or try libvirt? I
> remember numastat can give some intel about memory usage on each
> node.
None of that would work (and, this time, not because of missing pieces,
but by design). Well, making it possible to retrieve the info via
libvirt would be nice, and it will follow enabling it in libxl and xl.

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
Kun Cheng
Xen-users mailing list


