
Re: [Xen-devel] [PATCH] xen/arm: introduce vwfi parameter



Hi Dario,

On 02/18/2017 01:47 AM, Dario Faggioli wrote:
On Fri, 2017-02-17 at 14:50 -0800, Stefano Stabellini wrote:
On Fri, 17 Feb 2017, Julien Grall wrote:
Please explain in which context this will be beneficial. My gut
feeling is that it will only make performance worse if multiple vCPUs
of the same guest are running on the same pCPU.

I am not a scheduler expert, but I don't think so. Let me explain the
difference:

- vcpu_block blocks a vcpu until an event occurs, for example until
  it receives an interrupt

- vcpu_yield stops the vcpu from running until the next scheduler
  slot
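
For concreteness, here is a minimal sketch of how the two calls could be
wired up in the ARM WFI trap path. The vwfi_yield flag and the wrapper
function are assumptions standing in for whatever the proposed vwfi=
parameter actually selects; this is not the patch itself:

    /* Rough sketch only: vwfi_yield is an assumed flag derived from the
     * proposed vwfi= command line option; the real patch may use a
     * different name and set of values. Both calls act on 'current'. */
    static void handle_wfi(void)
    {
        if ( vwfi_yield )
            vcpu_yield();   /* give up the rest of the slot */
        else
            vcpu_block();   /* sleep until an event (e.g. an interrupt) */
    }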

So, what happens when you yield depends on how yield is implemented in
the specific scheduler, and on what other vcpus are runnable in the
system.

Currently, neither Credit1 nor Credit2 (nor the Linux scheduler,
AFAICR) really stops the yielding vcpus. Broadly speaking, the
following two scenarios are possible (a toy sketch of both approaches
follows the two cases below):
 - vcpu A yields, and there is one or more runnable but not already
   running other vcpus. In this case, A is indeed descheduled and put
   back in a scheduler runqueue in such a way that one or more of the
   runnable but not running other vcpus have a chance to execute,
   before the scheduler would consider A again. This may be
   implemented by putting A on the tail of the runqueue, so all the
   other vcpus will get a chance to run (this is basically what
   happens in Credit1, modulo periodic runq sorting). Or it may be
   implemented by ignoring A for the next <number> scheduling
   decisions after it yielded (this is basically what happens in
   Credit2). Both approaches have pros and cons, but the common bottom
   line is that others are given a chance to run.

 - vcpu A yields, and there are no runnable but not running vcpus
   around. In this case, A gets to run again. Full stop.
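
To make the two approaches above concrete, here is a toy sketch of the
two ways a yield can be honoured. The data structures and helpers are
invented for illustration; this is not the actual Credit1/Credit2 code:

    /* Toy illustration only -- these are not Xen's real structures. */
    #include <xen/list.h>
    #include <xen/types.h>

    struct toy_vcpu {
        struct list_head runq_elem;
        bool yielded;        /* set on yield, cleared at the next decision */
    };

    struct toy_runqueue {
        struct list_head runnable;
    };

    /* Credit1-style: requeue the yielding vcpu at the tail, so the other
     * runnable vcpus ahead of it each get a turn first. */
    static void yield_to_tail(struct toy_runqueue *rq, struct toy_vcpu *v)
    {
        list_del(&v->runq_elem);
        list_add_tail(&v->runq_elem, &rq->runnable);
    }

    /* Credit2-style: flag the vcpu as having yielded, so the next
     * scheduling decision skips it if anything else is runnable. */
    static void yield_set_flag(struct toy_vcpu *v)
    {
        v->yielded = true;
    }

    static struct toy_vcpu *pick_next(struct toy_runqueue *rq)
    {
        struct toy_vcpu *v, *fallback = NULL;

        list_for_each_entry ( v, &rq->runnable, runq_elem )
        {
            if ( v->yielded )
            {
                v->yielded = false;   /* only skipped for this decision */
                fallback = v;         /* scenario 2: it may still run */
                continue;
            }
            return v;                 /* scenario 1: someone else runs */
        }
        return fallback;              /* possibly the yielding vcpu itself */
    }

Either way, the yielding vcpu only loses its turn relative to other
runnable vcpus; it is never stopped outright.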

Which turns out to be the busy looping I was mentioning when one vCPU is assigned to a pCPU. This is not the goal of WFI, and I would be really surprised if embedded folks were happy with a solution that uses more power.

And when a vcpu that has yielded is picked back up for execution
--either immediately or after a few others-- it can run again. And if
it yields again (and again, and again), we just go back to option 1 or
2 above.

In both cases the vcpu is not run until the next slot, so I don't
think it should make the performance worse in multi-vcpu scenarios.
But I can do some tests to double check.

All the above being said, I also don't think it will affect multi-vcpu
VMs' performance much. In fact, even if the yielding vcpu is never
really stopped, the other ones are indeed given a chance to execute if
they want to and are able to.

But sure, it would not hurt to verify with some tests.

The main point of using wfi is power saving. With this change, you
will end up in a busy loop and, as you said, consume more power.

That's not true: the vcpu is still descheduled until the next slot.
There is no busy loop (that would indeed be very bad).

Well, as a matter of fact there may be busy-looping involved... But
isn't that the main point of all this? AFAIR, idle=poll in Linux does
very much the same, and has the same risk of potentially letting tasks
busy loop.

What will never happen is that a yielding vcpu, by busy looping,
prevents other runnable (and non yielding) vcpus from running. And if
it does, it's a bug. :-)

I didn't say it will prevent another vCPU from running. But it will at least use a slot that could have been used for a good purpose by another pCPU.

So in a similar workload Xen will perform worse with vwfi=idle, not to mention the power consumption...


I don't think this is acceptable even to get a better interrupt
latency. Some workloads will care about both interrupt latency and
power.

I think a better approach would be to check whether the scheduler has
another vCPU to run. If not, wait for an interrupt in the trap.

This would save the context switch to the idle vCPU if we are still
within the time slice of the vCPU.
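
Something like the following, purely as a sketch of the idea; the
helper sched_has_runnable_vcpu() is hypothetical and does not exist as
such in Xen:

    /* Hypothetical sketch of the proposal: only give the pCPU back if
     * the scheduler actually has other work; otherwise wait for an
     * interrupt right here and skip the switch to the idle vCPU.
     * sched_has_runnable_vcpu() is an invented helper. */
    static void handle_wfi_proposal(void)
    {
        if ( sched_has_runnable_vcpu(smp_processor_id()) )
            vcpu_block();                      /* let the other vCPU run */
        else
            asm volatile("wfi" ::: "memory");  /* wait for an interrupt here */
    }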

From my limited understanding of how schedulers work, I think this
cannot work reliably. It is the scheduler that needs to tell the
arch-specific code to put a pcpu to sleep, not the other way around.

Yes, that is basically true.

Another way to explain it would be by saying that, if there were other
vCPUs to run, we wouldn't have gone idle (and entered the idle loop).

In fact, in work conserving schedulers, if pCPU x becomes idle, it
means there is _nothing_ around that can execute on x itself. And our
schedulers are (with the exception of ARINC653, and if not using caps
in Credit1) work conserving, or at least they try to be as work
conserving as possible.
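
In other words (purely schematic, not any of the real schedulers'
code, and reusing the toy types from the sketch earlier in the
thread), a work-conserving pick function only falls back to the idle
vcpu when the runqueue is empty, so reaching the idle loop already
implies there was nothing else to run:

    /* Schematic only: never return the idle vcpu while a runnable vcpu
     * is still queued on this pCPU. */
    static struct toy_vcpu *work_conserving_pick(struct toy_runqueue *rq,
                                                 struct toy_vcpu *idle)
    {
        if ( !list_empty(&rq->runnable) )
            return list_entry(rq->runnable.next, struct toy_vcpu, runq_elem);
        return idle;   /* nothing runnable: only now do we go idle */
    }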

My knowledge of the scheduler is limited. Does the scheduler take into account the cost of a context switch when scheduling? When do you decide to run the idle vCPU? Is it only when no other vCPU is runnable, or do you have a heuristic?

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

