Re: [Xen-users] query memory allocation per NUMA node
On 01/18/2017 07:25 PM, Dario Faggioli wrote:
> On Wed, 2017-01-18 at 17:36 +0100, Eike Waldt wrote:
>> On 01/17/2017 12:23 AM, Dario Faggioli wrote:
>>> That may well be. But it sounds strange. I'd be inclined to think
>>> that there is something else going on.. Or maybe I'm just not
>>> understanding what you mean with "pinning whole NUMA nodes per DomU"
>>> (and that's why I'm asking for the commands output :-)).
>>>
>> I simply mean that you always pin ALL DomU vCPUs to a whole NUMA node
>> (or more) and not single vCPUs.
>>
> Ok, I understand it now (and I also see it in the output you sent me).
> Yes, this is usually what makes the most sense to do.
>
>> One detail to mention would be that we run all DomU filesystems on
>> NFS storage mounted on the Dom0.
>> Another interesting fact is that (as said above) we're doing some fio
>> write tests. These go to NFS filesystems and the write speed is about
>> 1000 MB/s (8000 Mbit/s) in the hard-pinning scenario and only
>> 100 MB/s in the soft-pinning scenario.
>>
> Mmm... ok, it's indeed interesting. But I can't really tell, off the
> top of my head, what kind of relationship/interaction this may have
> with hard vs soft pinning.
>
>> I'll send you some outputs.
>>
> Thanks. Looking at it.
>
> You really have a lot of domains! :-D
>
> So, in the hard pinning case, you totally isolate dom0, and it
> therefore makes sense that you see ~0% steal time from inside it.
>
> In the soft pinning case, you actually don't isolate it. In fact,
> although they'll try not to, the various DomUs are allowed to run on
> pCPUs 0-15, while, OTOH, dom0 is _not_allowed_ to run on 16-143.
>
> That's a bit unfair, and I think it justifies the (very!) high steal
> time.
>
> A fairer comparison between hard and soft pinning may be either:
>
> 1) use soft-affinity for dom0 too. I.e., as far as dom0 is concerned,
> the output of `xl vcpu-list' should look as follows:
>
> Name        ID  VCPU   CPU  State   Time(s)  Affinity (Hard / Soft)
> Domain-0     0     0     0    -b-     245.0  all / 0-15
> Domain-0     0     1     1    -b-      66.1  all / 0-15
> Domain-0     0     2     2    -b-     102.8  all / 0-15
> Domain-0     0     3     3    -b-      59.2  all / 0-15
> Domain-0     0     4     4    -b-     197.7  all / 0-15
> Domain-0     0     5     5    -b-      50.8  all / 0-15
> Domain-0     0     6     6    -b-      97.3  all / 0-15
> Domain-0     0     7     7    -b-      42.1  all / 0-15
> Domain-0     0     8     8    -b-      95.1  all / 0-15
> Domain-0     0     9     9    -b-      31.3  all / 0-15
> Domain-0     0    10    10    r--      96.4  all / 0-15
> Domain-0     0    11    11    -b-      33.0  all / 0-15
> Domain-0     0    12    12    r--     101.3  all / 0-15
> Domain-0     0    13    13    r--      30.1  all / 0-15
> Domain-0     0    14    14    -b-     100.9  all / 0-15
> Domain-0     0    15    15    -b-      39.4  all / 0-15
>
> To achieve this, I think you should get rid of dom0_vcpus_pin, keep
> dom0_max_vcpus=16 and add dom0_nodes=0,relaxed (or something like
> that). This will probably set the vcpu-affinity of dom0 to 'all/0-35',
> which you can change to 'all/0-15' after boot.

I got rid of "dom0_vcpus_pin" and did some tests...
all/0-15, 0-15/all or all/all for Dom0 does not make a difference
according to my tests in the soft-pinning case. I suppose that is
because the CPUs 0-15 are assigned anyhow.

The "dom0_nodes=0,relaxed"... I checked it out and it does exactly what
you (and the manpage) said:
relaxed --> all  / 0-35
strict  --> 0-35 / 0-35

Interestingly, "xl debug-keys u; xl dmesg" still shows memory pages on
NUMA node 3, even though the manpage says "dom0_nodes [..] Defaults for
vCPU-s created and memory assigned to Dom0 [..]". There should be enough
free pages on node 0 (there is no other DomU running directly after
startup).
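For reference, the commands involved look roughly like this (just a
sketch; on our SLES12 the hypervisor options should go into
GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub, followed by
"grub2-mkconfig -o /boot/grub2/grub.cfg", which may differ on other
distros, and "Domain-0" plus the pCPU ranges are of course specific to
this box):

  # Xen command line, instead of dom0_vcpus_pin:
  #   dom0_max_vcpus=16 dom0_nodes=0,relaxed

  # after boot, change Dom0 from 'all/0-35' to 'all/0-15'
  # (hard affinity stays "all"; the fourth argument is the soft
  #  affinity, available since Xen 4.5):
  xl vcpu-pin Domain-0 all all 0-15

  # verify:
  xl vcpu-list Domain-0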
> 2) properly isolate dom0, even in the soft-affinity case. That would
> mean keeping dom0 affinity as you already have it, but changing **all**
> the other domains' affinity from 'all/xx-yy' (where xx and yy vary from
> domain to domain) to '16-143/xx-yy'.

That was a very good hint! I did not realize that before, thank you so
much! The "issues" with stealing and bad NFS performance are gone now.

> Let me say again that I'm not at all saying that I'm sure that either 1
> or 2 will certainly perform better than the hard pinning case. This is
> impossible to tell without trying.
>
> But, like this, it's a fairer --and hence more interesting--
> comparison, and IMO it's worth a try.
>

When I isolate the Dom0 properly in the soft-pinning scenario, compared
to hard-pinning everything, I could not see any performance differences.
But this is very hard to measure, I think.

> Another thing: what Xen version is it that you're using again? I'm
> asking because I fixed a bug in Credit1's soft-affinity logic during
> the Xen 4.8 development cycle (as in, you may be subject to it if you
> are not on 4.8).
>
> Check it out here:
> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02184.html
>
> (It's commit f83fc393b "xen: credit1: fix mask to be used for tickling
> in Credit1" in Xen's git repo.)
>
> Checking stable releases, I'm able to find it in Xen 4.7.1 and in
> Xen 4.6.4, so these versions are also ok.
>
> If you're not on 4.8, 4.7.1 or 4.6.4, I'd recommend upgrading to any of
> those, but I understand that is not always super-straightforward! :-P

As you may have noticed from "xl info", we have a SLES12-SP2 here. They
call it "4.7.1_02-25". I just checked the sources and the fix seems to
be included.

> Regards,
> Dario
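In case somebody finds this in the archives: adjusting the other
domains' affinity can be done roughly like this (just a sketch; the
domain name and the xx-yy soft range are placeholders, and both the
cpus_soft= config option and the fourth vcpu-pin argument need Xen 4.5
or newer):

  # at runtime: hard affinity 16-143, soft affinity = the DomU's NUMA node
  xl vcpu-pin <domU> all 16-143 xx-yy

  # or persistently in the DomU's xl config file:
  cpus      = "16-143"
  cpus_soft = "xx-yy"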
--
Eike Waldt
Linux Consultant
Tel.: +49-175-7241189
Mail: waldt@xxxxxxxxxxxxx

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537