
Re: Recent upgrade of 4.13 -> 4.14 issue

On 26.10.20 17:31, Dario Faggioli wrote:
On Mon, 2020-10-26 at 15:30 +0100, Jürgen Groß wrote:
On 26.10.20 14:54, Andrew Cooper wrote:
On 26/10/2020 13:37, Frédéric Pierret wrote:

If anyone would have any idea of what's going on, that would be
appreciated. Thank you.

Does booting Xen with `sched=credit` make a difference?

Hmm, I think I have spotted a problem in credit2 which could explain

csched2_unit_wake() will NOT put the sched unit on a runqueue in case
it has CSFLAG_scheduled set. This bit will be reset only in

Exactly, it does not put it back there. However, if it finds a vCPU
with the CSFLAG_scheduled flag set, it should set the
CSFLAG_delayed_runq_add flag.

Unless curr_on_cpu(cpu)==unit or unit_on_runq(svc)==true... which
should not be the case. Or were you saying that we actually are in one
of these situations?
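
To make the three wake-up outcomes discussed above concrete, here is a
minimal toy model in C. It is only a sketch of the logic as described in
this thread, not the Xen code: the toy_* names, the struct layout and the
flag values are all made up for illustration, and locking, tickling and
credit accounting are left out entirely.

```c
#include <stdbool.h>

/* Illustrative flag bits; names echo the thread, values are arbitrary. */
enum {
    TOY_CSFLAG_scheduled        = 1u << 0,
    TOY_CSFLAG_delayed_runq_add = 1u << 1,
};

/* Hypothetical, heavily reduced stand-in for a credit2 scheduling unit. */
struct toy_unit {
    unsigned int flags;
    bool on_runq;     /* models unit_on_runq(svc) */
    bool is_current;  /* models curr_on_cpu(cpu) == unit */
};

enum toy_wake_result { TOY_WAKE_NOOP, TOY_WAKE_DELAYED, TOY_WAKE_QUEUED };

/* Toy version of the wake-up decision described in the thread. */
static enum toy_wake_result toy_unit_wake(struct toy_unit *u)
{
    /* Already running or already queued: nothing to do. */
    if (u->is_current || u->on_runq)
        return TOY_WAKE_NOOP;

    /* Still marked as scheduled: defer the runqueue insertion. */
    if (u->flags & TOY_CSFLAG_scheduled) {
        u->flags |= TOY_CSFLAG_delayed_runq_add;
        return TOY_WAKE_DELAYED;
    }

    /* Otherwise the unit can go on the runqueue right away. */
    u->on_runq = true;
    return TOY_WAKE_QUEUED;
}
```

In this model the "delayed" outcome never loses the wake-up: it only
records that somebody else has to do the insertion later.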

In fact...

So in case a vcpu (and its unit, of course) is blocked and there has
been no other vcpu active on its physical cpu but the idle vcpu, there
will be no call of csched2_context_saved(). This will keep the vcpu
from becoming active, in theory for eternity, in case there is no need
to run another vcpu on the physical cpu.

...I may not be seeing which exact situation and sequence of events
you're thinking of. What I see is this: [*]

- vCPU V is running, i.e., CSFLAG_scheduled is set
- vCPU V blocks
- we enter schedule()
   - schedule calls do_schedule() --> csched2_schedule()
     - we pick idle, so CSFLAG_delayed_runq_add is set for V
   - schedule calls sched_context_switch()
     - sched_context_switch() calls context_switch()
       - context_switch() calls sched_context_switched()
         - sched_context_switched() calls:
           - vcpu_context_saved()
           - unit_context_saved()
             - unit_context_saved() calls sched_context_saved() -->
               - csched2_context_saved():
                 - clears CSFLAG_scheduled
                 - checks (and clears) CSFLAG_delayed_runq_add

[*] this assumes granularity 1, i.e., no core-scheduling and no
     rendezvous. Or was core-scheduling actually enabled?

And if CSFLAG_delayed_runq_add is set **and** the vCPU is runnable, the
task is added back to the runqueue.

So, even if we don't do the actual context switch (i.e., we don't call
__context_switch() ) if the next vCPU that we pick when vCPU V blocks
is the idle one, it looks to me that we do get to call

And it also looks to me that, when we get to that, if the vCPU is
runnable, even if it has the CSFLAG_scheduled still set, we do put it
back to the runqueue.

And if the vCPU blocked, but csched2_unit_wake() ran while
CSFLAG_scheduled was still set, it indeed should mean that the vCPU
itself will be runnable again when we get to csched2_context_saved().
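
The context-saved step described above can be sketched in the same toy
style. Again, this is an assumption-laden model and not the actual
csched2_context_saved(): the toy_* names, the struct and the flag values
are hypothetical, and the early wake-up is modeled simply by setting the
runnable state and the delayed flag by hand.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative flag bits; the values are arbitrary. */
#define TOY_SCHEDULED        (1u << 0)
#define TOY_DELAYED_RUNQ_ADD (1u << 1)

/* Hypothetical, heavily reduced stand-in for a vCPU. */
struct toy_vcpu {
    unsigned int flags;
    bool runnable;
    bool on_runq;
};

/* Toy version of the context-saved hook as described in the thread. */
static void toy_context_saved(struct toy_vcpu *v)
{
    /* The hook is only expected for a vCPU that was scheduled. */
    assert(v->flags & TOY_SCHEDULED);
    v->flags &= ~TOY_SCHEDULED;

    /* Check and clear the delayed flag in one go. */
    bool delayed = (v->flags & TOY_DELAYED_RUNQ_ADD) != 0;
    v->flags &= ~TOY_DELAYED_RUNQ_ADD;

    /* Re-queue only if a wake-up was deferred AND the vCPU is runnable. */
    if (delayed && v->runnable)
        v->on_runq = true;
}

/*
 * Replays "V runs, V blocks, idle is picked", optionally with a wake-up
 * that arrives while TOY_SCHEDULED is still set. Returns whether V ends
 * up on the runqueue.
 */
static bool toy_block_and_save(bool woken_early)
{
    struct toy_vcpu v = { .flags = TOY_SCHEDULED, .runnable = true };

    v.runnable = false;                   /* V blocks */
    if (woken_early) {
        v.runnable = true;                /* wake-up makes V runnable */
        v.flags |= TOY_DELAYED_RUNQ_ADD;  /* insertion was deferred */
    }
    toy_context_saved(&v);                /* runs even when idle is next */
    return v.on_runq;
}
```

Under these assumptions toy_block_and_save(true) leaves V on the
runqueue, while toy_block_and_save(false) leaves it blocked and off the
runqueue until a later wake-up, which then finds the scheduled flag
already cleared.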

Or did you have something completely different in mind, and I'm missing

No, I think you are right. I mixed that up with __context_switch() not
being called.

Sorry for the noise,



