
Re: [Xen-users] query memory allocation per NUMA node

On Mon, Jan 9, 2017 at 10:47 PM Eike Waldt <waldt@xxxxxxxxxxxxx> wrote:
On 01/09/2017 03:01 PM, Kun Cheng wrote:
> I haven't worked with NUMA in recent years, so my information may be
> out of date.
> Actually, I think it's quite difficult to retrieve such info through a
> command, as Xen only provides some NUMA placement and scheduling
> (load-balancing) support (plus the vNUMA feature; it may still be
> experimental, but it was functional the last time I tried it). From my
> understanding, probing memory allocation would be difficult because
> such things are dynamic, or maybe it is just not worth the effort.
> Reasons are:
> First, NUMA placement tries to allocate as much memory as possible on
> local nodes (in most cases Xen will find a node that can fit the VM's
> memory requirement; if 4 vCPUs are pinned to node 0, then node 0 is a
> local node), but Xen doesn't seem to track how much memory has been
> allocated to a certain VM per node in such situations (it tries to
> allocate as much as possible on one node, even if a VM's vCPUs are
> spread among several nodes, which is rare but possible). Having 800MB
> on node 0 is pretty much the same as 900MB on node 0 if your VM
> requires 1GB; both will have a similar performance impact on your VM.

Xen must then have a mechanism to determine which NUMA node is
emptiest/preferred.
I even read about different "NUMA placement policies" in [1], but didn't
find a way to set them.

The placement seems to be automatic (perhaps in a greedy way).

What I'm looking for here is a command-line option for "xl":
a handy alternative to "xl debug-keys u; xl dmesg"...
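Lacking such an option, one workaround is to scrape the hypervisor log yourself. A rough sketch, assuming the per-domain dump that `xl debug-keys u` writes into `xl dmesg` looks roughly like the sample below (the exact format varies between Xen versions, so treat the regexes as a starting point, not a spec):

```python
import re

# Hypothetical excerpt of `xl dmesg` output after `xl debug-keys u`;
# the real format differs between Xen versions.
sample = """
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 524288):
(XEN)     Node 0: 262144
(XEN)     Node 1: 262144
(XEN) Domain 3 (total: 262144):
(XEN)     Node 0: 262144
"""

def parse_numa_dump(text):
    """Return {domid: {node: pages}} parsed from the dump."""
    result, domid = {}, None
    for line in text.splitlines():
        m = re.search(r"Domain (\d+) \(total: (\d+)\):", line)
        if m:
            domid = int(m.group(1))
            result[domid] = {}
            continue
        m = re.search(r"Node (\d+): (\d+)", line)
        if m and domid is not None:
            result[domid][int(m.group(1))] = int(m.group(2))
    return result

print(parse_numa_dump(sample))
```

In practice you would feed it the output of `xl debug-keys u && xl dmesg` from a cron job, which at least makes the dump parseable and periodic.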

As I said, perhaps no existing command-line tool can do the trick.


> Second, a VM can be migrated to other nodes due to load balancing,
> which may make it harder to count how much memory has been allocated
> for a certain VM on each node.

Why should it be harder to count then? "xl debug-keys u; xl dmesg"
already gives me this information (but you cannot really parse it or
run it periodically).

Because in some situations (high load) it changes rapidly: due to load balancing, the hypervisor may have to migrate some vCPUs (and then memory) to another node. About 14 months ago I asked Wei Liu and Dario about the possibility of optimizing NUMA support (mainly vCPU & memory load-balancing & migration), and at that time they said it was not in their upcoming plans. So I tried it myself for fun, but the first problem I encountered was counting how much memory is consumed on each node by each VM, since a more accurate NUMA scheduler & load balancer would depend on such info. However, in a high-load situation (exactly where good load balancing matters most), memory usage changes rapidly: at some point you may find a more suitable node with enough free memory, but by the time probing finishes it may no longer be a candidate because 1) some VMs have already been migrated to that node, or 2) existing VMs on that node have ballooned their memory.

My point is that such things can be highly dynamic (or I've been overthinking this problem). But for your situation there should be another way, as I understand you just want to know the memory usage on each node for VM placement.

Or maybe it's just not worth the effort... Even if one can come up with better scheduling & load balancing, it would be better to avoid moving VMs and their memory at all, following the principle of locality, rather than doing it more accurately in a way that migrates VMs more often.

If I understood it correctly, Xen decides on which NUMA node the DomU
shall run and allocates the needed memory... After that it does a
"soft-pinning" of the DomU's vCPUs to pCPUs (at least that is what I
observed on my test systems).

Doing only soft-pinning is far worse for overall performance than
hard-pinning (according to my first tests).

But to do hard-pinning the correct way I need to know on which
NUMA-nodes the DomU runs...Otherwise performance will be impacted again.

As I cannot change on which NUMA node the DomU is started (unless I
specify pCPUs in the DomU's config [which would require something
"intelligent" to figure out which node/CPUs to use]), I have to do it
this way around, or am I getting it totally wrong?
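For the hard-pinning itself, xl's cpus= syntax accepts a node specifier, so once you know (or choose) the node you don't have to list individual pCPUs. A sketch of a DomU config fragment, assuming your xl version supports the "node:" notation (check xl.cfg(5) before relying on it):

```
# Hard-pin all vCPUs of this DomU to the pCPUs of NUMA node 0.
# This also steers Xen's automatic memory placement toward node 0.
cpus = "node:0"
memory = 4096
vcpus = 4
```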

In Xen, manual NUMA node preference is expressed through vCPU pinning, no matter how you do it: soft pinning, hard pinning, or even a CPU pool.

If all your VMs are already pinned to certain nodes, maybe you can parse the configuration files or xenstore to retrieve the existing placement info (yes, it certainly involves some programming work) and find the best place for the next VM? If you'd like to do a little more programming, take a look at xen/arch/x86/numa.c; I remember there's code retrieving the memory mapping info for each node.
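If the VMs are hard-pinned, the placement can also be recovered from `xl vcpu-list` without touching xenstore. A sketch, assuming the usual column layout and a known pCPU-to-node mapping (both are assumptions; take the real topology from `xl info -n`):

```python
# Hypothetical `xl vcpu-list` output; real column widths/values differ.
sample = """Name        ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
Domain-0     0     0     3   -b-     300.1  all / all
guest1       1     0     9   r--      10.2  8-15 / all
guest1       1     1    12   -b-       9.8  8-15 / all
"""

# Assumed topology: pCPUs 0-7 on node 0, pCPUs 8-15 on node 1.
def cpu_to_node(cpu):
    return 0 if cpu < 8 else 1

def nodes_in_use(text):
    """Return {domain_name: set_of_nodes} from the CPU column."""
    usage = {}
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 4:
            continue
        name, cpu = fields[0], int(fields[3])
        usage.setdefault(name, set()).add(cpu_to_node(cpu))
    return usage

print(nodes_in_use(sample))
```

This only reflects where the vCPUs currently run; combined with hard affinity it is a reasonable proxy for where the memory sits.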

> If you can't find useful info in Xenstore, then perhaps such feature you
> required is not yet available.

No, I did not find anything in xenstore.

> However, if you just want to know the memory usage on each node, perhaps
> you could try numactl and get some outputs? Or try libvirt? I remember
> numastat can give some intel about memory usage on each node.

As far as I understand numactl/numastat will not work in Dom0.

I remember that, since Domain-0 itself runs on top of Xen, numactl/numastat will not work there. But "xl info" can return some NUMA topology info; have you tried getting more output from it in the latest version?
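For picking the emptiest node from dom0, the numa_info block of `xl info -n` may already be enough to parse. A sketch over a sample of that block (the column layout here is an assumption from memory; verify it against your version's output):

```python
# Hypothetical numa_info section from `xl info -n` (sizes in MB).
sample = """node:    memsize    memfree    distances
   0:      65536      12345     10,21
   1:      65536      54321     21,10
"""

def emptiest_node(text):
    """Return (node, memfree) for the node with the most free memory."""
    best = None
    for line in text.splitlines()[1:]:
        fields = line.split()
        if len(fields) < 3:
            continue
        node = int(fields[0].rstrip(":"))
        free = int(fields[2])
        if best is None or free > best[1]:
            best = (node, free)
    return best

print(emptiest_node(sample))
```

Free memory per node still isn't per-domain allocation, but it answers the "which node should the next DomU go to" question.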

Or maybe libvirt could be of some help?

> Or, try combining NUMA support with vNUMA; perhaps you can get such
> info inside a VM.
> Best,
> Kun


> On Mon, Jan 9, 2017 at 5:43 PM Eike Waldt <waldt@xxxxxxxxxxxxx> wrote:
>     On 01/04/2017 03:15 PM, Eike Waldt wrote:
>     > Hi Xen users,
>     >
>     > on [0] under #Querying Memory Distribution it says:
>     >
>     > "Up to Xen 4.4, there is no easy way to figure out how much memory
>     from
>     > each domain has been allocated on each NUMA node in the host."
>     >
>     > Is there a way in xen 4.7 ?
>     anybody?
>     >
>     > [0] https://wiki.xen.org/wiki/Xen_on_NUMA_Machines
>     >
>     >
>     >
>     > _______________________________________________
>     > Xen-users mailing list
>     > Xen-users@xxxxxxxxxxxxx
>     > https://lists.xen.org/xen-users
>     >
>     --
>     Eike Waldt
>     Linux Consultant
>     Tel.: +49-175-7241189
>     Mail: waldt@xxxxxxxxxxxxx
>     B1 Systems GmbH
>     Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
>     GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537
>     _______________________________________________
>     Xen-users mailing list
>     Xen-users@xxxxxxxxxxxxx
>     https://lists.xen.org/xen-users
> --
> Regards,
> Kun Cheng

Eike Waldt
Linux Consultant
Tel.: +49-175-7241189
Mail: waldt@xxxxxxxxxxxxx

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537

Kun Cheng


