
Re: Null scheduler and vwfi native problem



On Fri, 2021-01-29 at 09:18 +0100, Jürgen Groß wrote:
> On 29.01.21 09:08, Anders Törnqvist wrote:
> > > 
> > > So using it has only downsides (and that's true in general, if you
> > > ask me, but particularly so if using NULL).
> > Thanks for the feedback.
> > I removed dom0_vcpus_pin. And, as you said, it seems to be
> > unrelated to 
> > the problem we're discussing. 
>
Right. Don't put it back, and stay away from it, if you'll accept a
piece of advice. :-)

> > The system still behaves the same.
> > 
Yeah, that was expected.

> > When dom0_vcpus_pin is removed, xl vcpu-list looks like this:
> > 
> > Name                                ID  VCPU   CPU State Time(s) Affinity (Hard / Soft)
> > Domain-0                             0     0    0   r--     29.4   all / all
> > Domain-0                             0     1    1   r--     28.7   all / all
> > Domain-0                             0     2    2   r--     28.7   all / all
> > Domain-0                             0     3    3   r--     28.6   all / all
> > Domain-0                             0     4    4   r--     28.6   all / all
> > mydomu                               1     0    5   r--     21.6     5 / all
> > 
Right, and it makes sense for it to look like this.

> > From this listing (with "all" as hard affinity for dom0) one might
> > read it like dom0 is not pinned with hard affinity to any specific
> > pCPUs at all, but mydomu is pinned to pCPU 5.
> > Will dom0_max_vcpus=5 in this case guarantee that dom0 will only run
> > on pCPUs 0-4, so that mydomu always has pCPU 5 for itself?
> 
> No.
>
Well, yes... if you use the NULL scheduler. Which is in use here. :-)

Basically, the NULL scheduler _always_ assigns one and only one vCPU to
each pCPU. This happens at domain (well, at vCPU) creation time. And it
_never_ moves a vCPU away from the pCPU to which it has been assigned.

And it also _never_ changes this vCPU-->pCPU assignment/relationship,
unless some special event happens (such as the vCPU or the pCPU going
offline, being removed from the cpupool, you changing the affinity [as
I'll explain below], etc.).

This is the NULL scheduler's mission and only job, so it does that by
default, _without_ any need for an affinity to be specified.
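
Just for context, and only as a sketch (how exactly the Xen command
line is passed depends on your bootloader/platform), the kind of setup
discussed in this thread boils down to booting Xen with options along
the lines of:

  sched=null vwfi=native dom0_max_vcpus=5

i.e., the NULL scheduler, vwfi=native (the subject of this thread) and
5 dom0 vCPUs, with dom0_vcpus_pin left out, as said above.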

So, how can affinity be useful in the NULL scheduler? Well, it's useful
if you want to control and decide to what pCPU a certain vCPU should
go.

So, let's make an example. Let's say you are in this situation:

Name                                ID  VCPU   CPU State Time(s) Affinity (Hard / Soft)
Domain-0                             0     0    0   r--     29.4   all / all
Domain-0                             0     1    1   r--     28.7   all / all
Domain-0                             0     2    2   r--     28.7   all / all
Domain-0                             0     3    3   r--     28.6   all / all
Domain-0                             0     4    4   r--     28.6   all / all

I.e., you have 6 CPUs, you have only dom0, dom0 has 5 vCPUs and you are
not using dom0_vcpus_pin.

The NULL scheduler has put d0v0 on pCPU 0. And d0v0 is the only vCPU
that can run on pCPU 0, despite its affinities being "all"... because
it's what the NULL scheduler does for you and it's the reason why one
uses it! :-)

Similarly, it has put d0v1 on pCPU 1, d0v2 on pCPU 2, d0v3 on pCPU 3
and d0v4 on pCPU 4. And the "exclusivity guarantee" explained above
for d0v0 and pCPU 0 applies to all these other vCPUs and pCPUs as
well.

With no affinity being specified, which vCPU is assigned to which pCPU
is entirely under the NULL scheduler's control. It has its own
heuristics for trying to do that in a smart way, but that's an
internal/implementation detail and it is not relevant here.

If you now create a domU with 1 vCPU, that vCPU will be assigned to
pCPU 5.
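
Just to fix ideas, by "a domU with 1 vCPU" I mean an xl guest config
along these lines (the kernel path here is made up by me, adjust to
your setup):

  name   = "mydomu"
  vcpus  = 1
  memory = 512
  kernel = "/path/to/guest/kernel"

created with xl create as usual.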

Now, let's say that, for whatever reason, you absolutely want d0v2 to
run on pCPU 5, instead of being assigned to, and running on, pCPU 2
(which is what the NULL scheduler decided to pick for it). Well, what
you do is use xl to set the affinity of d0v2 to pCPU 5 (there is a
command sketch right after the listing), and you will get something
like this as a result:

Name                                ID  VCPU   CPU State Time(s) Affinity (Hard / Soft)
Domain-0                             0     0    0   r--     29.4   all / all
Domain-0                             0     1    1   r--     28.7   all / all
Domain-0                             0     2    5   r--     28.7     5 / all
Domain-0                             0     3    3   r--     28.6   all / all
Domain-0                             0     4    4   r--     28.6   all / all
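
For completeness, a sketch of the xl command that does this (assuming,
as in the listings, that dom0's domain ID is 0):

  # pin d0v2 (domain 0, vCPU 2) to pCPU 5 (hard affinity; soft stays "all")
  xl vcpu-pin 0 2 5

  # and check the result
  xl vcpu-list 0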

So, affinity is indeed useful, even when using NULL, if you want to
diverge from the default behavior and enact a certain policy, maybe due
to the nature of your workload, the characteristics of your hardware,
or whatever.

It is not, however, necessary to set the affinity in order to:
 - have a vCPU always stay on one --and always the same-- pCPU;
 - prevent any other vCPU from ever running on that pCPU.

That is guaranteed by the NULL scheduler itself. It just can't behave
otherwise, because the whole point of writing it was to make it simple
(and fast :-)) *exactly* by not teaching it how to do such things. It
can't do them, because the code for doing them is not there... by
design! :-D

And, BTW, if you now create a domU with 1 vCPU, that vCPU will be
assigned to pCPU 2.

> > What if I would like mydomu to be the only domain that uses pCPU 2?
> 
> Setup a cpupool with that pcpu assigned to it and put your domain into
> that cpupool.

Yes, with any other scheduler that is not NULL, that's the proper way
of doing it.
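
For concreteness, a rough sketch of the steps; the pool name
"p-mydomu", the config file path and the choice of credit2 as the
pool's scheduler are just examples of mine:

  # /etc/xen/p-mydomu.cfg (hypothetical path), containing:
  #   name  = "p-mydomu"
  #   sched = "credit2"
  #   cpus  = ["2"]

  # a pCPU can only be in one pool, so free pCPU 2 from the default pool first
  xl cpupool-cpu-remove Pool-0 2

  # create the pool and move the domain into it
  xl cpupool-create /etc/xen/p-mydomu.cfg
  xl cpupool-migrate mydomu p-mydomu

  # check
  xl cpupool-list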

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)
