
Re: [Xen-devel] [PATCH] xen/arm: introduce vwfi parameter



Hi Dario,

On 20/02/2017 22:53, Dario Faggioli wrote:
On Mon, 2017-02-20 at 19:38 +0000, Julien Grall wrote:
On 20/02/17 19:20, Dario Faggioli wrote:
E.g., if vCPU x of domain A wants to go idle with a WFI/WFE, but the host is overbooked and currently really busy, Xen wants to run some other vCPU (of either the same or another domain).

That's actually the whole point of virtualization, and the reason why overbooking a host with more vCPUs (from multiple guests) than it has pCPUs works at all. If we start letting guests put the host's pCPUs to sleep, not only the scheduler, but many things would break, IMO!

I am not speaking about the general case but about when you get 1 vCPU pinned to 1 pCPU (I think this is Stefano's use case). No other vCPU will run on this pCPU. So it would be fine to let the guest do the WFI.

Mmm... ok, yes, in that case, it may make sense and work, from a, let's
say, purely functional perspective. But still I struggle to place this
in a bigger picture.

For instance, as you say, executing a WFI from a guest directly on
hardware, only makes sense if we have 1:1 static pinning. Which means
it can't just be done by default, or with a boot parameter, because we
need to check and enforce that there's only 1:1 pinning around.

I agree it cannot be done by default. Similarly, the poll mode cannot be done by default platform-wide nor per domain, because you need to know that all vCPUs will be in polling mode.

But as I said, if vCPUs are not pinned this patch has very little advantage, because you may context switch between them when yielding.
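To make that trade-off concrete, here is a minimal sketch, not Stefano's actual patch, of what a block-vs-yield choice on a trapped WFI could look like. vcpu_block() and vcpu_yield() stand for the usual scheduler primitives; the vwfi_yield flag and the handler itself are simplified assumptions:

/*
 * Illustrative sketch only -- not the patch under discussion.  It
 * shows the idea of a boot-time switch between "block on WFI" and
 * "yield on WFI".
 */
static bool vwfi_yield;   /* hypothetically set from e.g. a "vwfi=yield" option */

static void handle_guest_wfi(void)
{
    if ( vwfi_yield )
        /*
         * Give the pCPU away but stay runnable: lower wake-up latency
         * when the interrupt arrives, at the price of burning the
         * vCPU's credit -- and of pointless context switches if other
         * vCPUs share the pCPU (the "very little advantage" above).
         */
        vcpu_yield();
    else
        /* Default: sleep until an event wakes the vCPU up. */
        vcpu_block();

    /* A real handler would also advance the guest PC past the WFI. */
}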


Is it possible to decide whether to trap and emulate WFI, or just execute it, online, and change such a decision dynamically? And even if yes, how would the whole thing work? When direct execution is enabled for a domain, do we automatically enforce 1:1 pinning for that domain, and kick all the other domains out of its pCPUs? What if they have their own pinning, what if they also have 'direct WFI' behavior enabled?

It can be changed online; the WFI/WFE trapping is controlled per pCPU (see HCR_EL2.{TWE,TWI}).
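For reference, a rough sketch of how that could be flipped at run time, assuming Xen's arm macros (READ_SYSREG/WRITE_SYSREG, HCR_TWI/HCR_TWE, isb()); the helper itself is illustrative, not existing code:

/*
 * HCR_EL2.TWI traps guest WFI, HCR_EL2.TWE traps guest WFE.  The
 * register is banked per physical CPU, so trapping can in principle
 * be enabled/disabled dynamically, e.g. when a 1:1 pinned vCPU is
 * context-switched in.
 */
static void set_wfi_wfe_trapping(bool trap)
{
    register_t hcr = READ_SYSREG(HCR_EL2);

    if ( trap )
        hcr |= (HCR_TWI | HCR_TWE);    /* WFI/WFE trap to the hypervisor */
    else
        hcr &= ~(HCR_TWI | HCR_TWE);   /* WFI/WFE execute natively */

    WRITE_SYSREG(hcr, HCR_EL2);
    isb();
}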


If it is not possible to change all this online and on a per-domain basis, what do we do? When booted with the 'direct WFI' flag, do we only accept 1:1 pinning? Who should enforce that, the setvcpuaffinity hypercall?

These are just examples, my point being that in theory, if we consider a very specific use case or set of use cases, there's a lot we can do. But when you say "why don't you let the guest directly execute WFI", in response to a patch and a discussion like this, people may think that you are actually proposing doing it as a solution, which is not possible without figuring out all the open questions above (actually, probably, more) and without introducing a lot of cross-subsystem policing inside Xen, which is often something we don't want.

I made this response because the patch sent by Stefano addresses a very specific use case that can be solved the same way. Everyone here is suggesting polling, but it has its own disadvantage: power consumption.

Anyway, I still think that in both cases we are solving a specific problem without looking at what matters, i.e. why the scheduler takes so much time to block/unblock.


But, if you let me say this again, it looks to me we are trying to
solve too many problem all at once in this thread, should we try
slowing down/refocusing? :-)

If you run multiple vCPUs on the same pCPU you will have a higher interrupt latency. And blocking the vCPU or yielding will likely give the same numbers, unless you know the interrupt will come right away.

Maybe. At least on x86, that would depend on the actual load. If all
your pCPUs are more than 100% loaded, yes. If the load is less than
that, you may still see improvements.

But in that case, using WFI in the guest may not have been the right thing to do.

But if the guest is, let's say, Linux, does it use WFI or not? And is
it the right thing or not?

Again, the fact you're saying this probably means there's something I
am either missing or ignoring about ARM.

WFI/WFE are a way to be nice and save power. It is not mandatory to use them; a guest OS can perfectly well decide that it does not need them.
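To illustrate the guest-side choice (illustrative helpers, not code from any particular OS): an OS can idle by waiting for an interrupt, or it can simply spin; the first lets the core (and, when trapped, the hypervisor) save power or run something else, the second keeps it 100% busy:

static inline void idle_wfi(void)
{
    /* Be nice: halt the core until an interrupt (or other wake-up
     * event) arrives.  Under Xen this is the instruction that may be
     * trapped and emulated. */
    asm volatile("wfi" ::: "memory");
}

static inline void idle_poll(volatile unsigned int *work_pending)
{
    /* Busy-poll instead: lowest wake-up latency, but the CPU -- and,
     * when virtualised, the pCPU -- never goes idle, hence the power
     * consumption concern above. */
    while ( !*work_pending )
        asm volatile("" ::: "memory");   /* compiler barrier only */
}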

I have heard of use cases where people want to disable the scheduler (e.g. a nop scheduler) because they know only 1 vCPU will ever run on the pCPU. This is exactly the use case I am thinking about.

Sure! Except that, in Xen, we don't know whether we have, and always will have, only 1 vCPU ever running on each pCPU. Nor do we have a way to enforce that, either in the toolstack or in the hypervisor. :-P

So, I'm not sure what we're talking about, but what I'm quite sure
is
that we don't want a guest to be able to decide when and until what
time/event, a pCPU goes idle.

Well, if the guest is not using the WFI/WFE at all you would need an
interrupt from the scheduler to get it running.

If the guest is not using WFI, it's busy looping, isn't it?

Yes, very similar to what we are implementing with the poll here.


So here it is similar: the scheduler would have set up a timer, and the processor will wake up when receiving the timer interrupt and enter the hypervisor.

So, yes, in the end the guest will waste its slot.

Did I say it already that this concept of "slots" does not apply here?
:-D

Sorry, I forgot about this :/. I guess you use the term credit? If so, the guest will use its credit for nothing.
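As a sketch of what "use its credit for nothing" means mechanically, using Xen-style timer calls (set_timer()/NOW()/MILLISECS()) but otherwise illustrative names (slice_timer, SLICE_MS, run_vcpu() are assumptions, not existing symbols):

/*
 * When a vCPU is scheduled in, the scheduler arms a one-shot timer
 * for the end of the accounting period.  A guest that never executes
 * WFI just spins until that timer interrupt traps back into the
 * hypervisor, where the scheduler can preempt it -- i.e. the guest
 * burns its credit doing nothing.
 */
#define SLICE_MS 30   /* hypothetical accounting period */

static void schedule_vcpu_in(struct vcpu *v)
{
    set_timer(&this_cpu(slice_timer), NOW() + MILLISECS(SLICE_MS));

    run_vcpu(v);   /* illustrative: enter the guest */
}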

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

