
Re: [Xen-users] query memory allocation per NUMA node

To: Eike Waldt <waldt@xxxxxxxxxxxxx>, Kun Cheng <chengkunck@xxxxxxxxx>, <xen-users@xxxxxxxxxxxxx>
From: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
Date: Wed, 18 Jan 2017 19:25:54 +0100
Delivery-date: Wed, 18 Jan 2017 18:27:09 +0000
List-id: Xen user discussion <xen-users.lists.xen.org>

On Wed, 2017-01-18 at 17:36 +0100, Eike Waldt wrote:
> On 01/17/2017 12:23 AM, Dario Faggioli wrote:
> > That may well be. But it sounds strange. I'd be inclined to think
> > that
> > there is something else going on.. Or maybe I'm just not
> > understanding
> > what you mean with "pinning while NUMA nodes per DomU" (and that's
> > why
> > I'm asking for the commands output :-)).
> > 
> I simply mean that you always pin ALL DomU vCPUs to a whole NUMA node
> (or more) and not single vCPUs.
> 
Ok, I understand it now (and I also see it in the output you sent me).
Yes, this is usually what makes the most sense to do.

> One detail to mention would be, that we run all DomU filesystems on
> NFS
> storage mounted on the Dom0.
> Another interesting fact is, that (as said above) we're doing some
> fio
> write tests. These go to NFS filesystems and the write speed is about
> 1000 MB/s (8000 Mbit/s) in the hard-pinning scenario and only 100
> MB/s
> in the soft-pinning scenario.
> 
Mmm... ok, it's indeed interesting. But I can't really tell, off the
top of my head, what kind of relationship/interaction this may have
with hard vs soft pinning.

> I'll send you some outputs.
> 
Thanks. Looking at it.

You really have a lot of domains! :-D

So, in the hard pinning case, you totally isolate dom0, and it
therefore makes sense that you see ~0% steal time from inside it.

In the soft pinning case, you actually don't isolate it. In fact,
although they'll try not to, the various DomU are allowed to run on
pCPUs 0-15, while, OTOH, dom0 is _not_allowed_ to run on 16-143.

That's a bit unfair, and I think justifies the (very!) high steal time.

A fairer comparison between hard and soft pinning could be either of these:

1) use soft-affinity for dom0 too. I.e., as far as dom0 is concerned,
output of `xl vcpu-list' should look as follows:

Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
Domain-0                             0     0    0   -b-     245.0  all / 0-15
Domain-0                             0     1    1   -b-      66.1  all / 0-15
Domain-0                             0     2    2   -b-     102.8  all / 0-15
Domain-0                             0     3    3   -b-      59.2  all / 0-15
Domain-0                             0     4    4   -b-     197.7  all / 0-15
Domain-0                             0     5    5   -b-      50.8  all / 0-15
Domain-0                             0     6    6   -b-      97.3  all / 0-15
Domain-0                             0     7    7   -b-      42.1  all / 0-15
Domain-0                             0     8    8   -b-      95.1  all / 0-15
Domain-0                             0     9    9   -b-      31.3  all / 0-15
Domain-0                             0    10   10   r--      96.4  all / 0-15
Domain-0                             0    11   11   -b-      33.0  all / 0-15
Domain-0                             0    12   12   r--     101.3  all / 0-15
Domain-0                             0    13   13   r--      30.1  all / 0-15
Domain-0                             0    14   14   -b-     100.9  all / 0-15
Domain-0                             0    15   15   -b-      39.4  all / 0-15

To achieve this, I think you should get rid of dom0_vcpus_pin, keep
dom0_max_vcpus=16 and add dom0_nodes=0,relaxed (or something like
that). This will probably set the vcpu-affinity of dom0 to 'all/0-35',
which you can change to 'all/0-15' after boot.

2) properly isolate dom0, even in the soft-affinity case. That would
mean keeping dom0 affinity as you already have it, but change **all**
the other domains' affinity from 'all/xx-yy' (where xx and yy vary from
domain to domain) to '16-143/xx-yy'.
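
The two options above can be sketched as `xl vcpu-pin` invocations. This is
only a minimal Python sketch that builds the command lines; it does not run
anything. The helper names and the guest names/soft ranges in the example are
hypothetical, and it assumes an xl recent enough to take both a hard and a
soft affinity argument on `vcpu-pin`.

```python
def dom0_soft_pin_cmd(soft="0-15"):
    """Option 1: dom0 keeps hard affinity 'all' and only gets a soft affinity."""
    # Argument order: <domain> <vcpu|all> <hard affinity> <soft affinity>
    return ["xl", "vcpu-pin", "Domain-0", "all", "all", soft]


def domu_isolating_cmds(soft_by_domain, hard="16-143"):
    """Option 2: keep each DomU's soft affinity, but restrict its hard
    affinity so that it can never run on dom0's pCPUs (0-15)."""
    return [
        ["xl", "vcpu-pin", name, "all", hard, soft]
        for name, soft in sorted(soft_by_domain.items())
    ]


if __name__ == "__main__":
    # Hypothetical guests and soft ranges, for illustration only.
    guests = {"guest-a": "36-71", "guest-b": "72-107"}
    for cmd in [dom0_soft_pin_cmd()] + domu_isolating_cmds(guests):
        print(" ".join(cmd))
```

For option 1 this is only the "change it after boot" step; the boot
parameters themselves still have to be set on the Xen command line.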

Let me say again that I'm not claiming that either 1 or 2 will perform
better than the hard pinning case. This is impossible to tell without
trying.

But, like this, it's a more fair --and hence more interesting--
comparison, and IMO it's worth a try.
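
Before comparing numbers, it may be worth checking that the affinities really
are what was intended. Here is a minimal Python sketch that parses
`xl vcpu-list` output in the layout shown earlier and reports vCPUs whose
hard/soft affinity differs from the expected pair. The function names are
hypothetical, and the parser assumes domain names without spaces and the
"hard / soft" affinity column as the last field.

```python
SAMPLE = """\
Name                                ID  VCPU   CPU State   Time(s) Affinity (Hard / Soft)
Domain-0                             0     0    0   -b-     245.0  all / 0-15
Domain-0                             0     1    1   -b-      66.1  all / 0-15
"""


def parse_vcpu_list(text):
    """Return one dict per vCPU row with its name, vCPU number and affinities."""
    rows = []
    for line in text.splitlines():
        fields = line.split(None, 6)
        # Skip blank lines, the header, and rows without a "hard / soft" column.
        if len(fields) < 7 or not fields[1].isdigit() or "/" not in fields[6]:
            continue
        hard, soft = (part.strip() for part in fields[6].split("/", 1))
        rows.append(
            {"name": fields[0], "vcpu": int(fields[2]), "hard": hard, "soft": soft}
        )
    return rows


def misconfigured(rows, name, hard, soft):
    """List the vCPUs of domain `name` whose affinity is not (hard, soft)."""
    return [
        r["vcpu"]
        for r in rows
        if r["name"] == name and (r["hard"], r["soft"]) != (hard, soft)
    ]


if __name__ == "__main__":
    print(misconfigured(parse_vcpu_list(SAMPLE), "Domain-0", "all", "0-15"))
```

An empty list means every vCPU of that domain matches the expected affinity.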

Another thing, what Xen version is it that you're using again? I'm
asking because I fixed a bug in Credit1's soft-affinity logic, during
the Xen 4.8 development cycle (as in, you may be subject to it, if not
on 4.8).

Check that out here:
https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02184.html

(it's commit f83fc393b "xen: credit1: fix mask to be used for tickling
in Credit1" in Xen's git repo.)

Checking stable releases, I'm able to find it in Xen 4.7.1, and in
Xen 4.6.4, so these versions are also ok.

If you're not on 4.8, 4.7.1 or 4.6.4, I'd recommend upgrading to
any of those, but I understand that this is not always
super-straightforward! :-P
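
The version rule above can be written down as a small check. This is a
minimal Python sketch with a hypothetical function name. It assumes that
point releases after 4.7.1 and 4.6.4, and anything newer than 4.8, also carry
the fix, and it knows nothing about distro packages that may have backported
the patch to an older version.

```python
import re


def has_tickling_fix(version):
    """Return True if a Xen version string such as '4.7.1' is expected to
    contain the Credit1 soft-affinity tickling fix."""
    m = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if not m:
        raise ValueError("unrecognised Xen version: %r" % version)
    major, minor = int(m.group(1)), int(m.group(2))
    patch = int(m.group(3) or 0)
    if (major, minor) >= (4, 8):   # assumption: 4.8 and later include it
        return True
    if (major, minor) == (4, 7):   # found in 4.7.1
        return patch >= 1
    if (major, minor) == (4, 6):   # found in 4.6.4
        return patch >= 4
    return False                   # older branches: not known to have it


if __name__ == "__main__":
    for v in ("4.6.3", "4.6.4", "4.7.0", "4.7.1", "4.8.0"):
        print(v, has_tickling_fix(v))
```

The version string can be put together from the xen_major, xen_minor and
xen_extra fields of `xl info`.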

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


