Re: Recent upgrade of 4.13 -> 4.14 issue
On Mon, 2020-10-26 at 15:30 +0100, Jürgen Groß wrote:
> On 26.10.20 14:54, Andrew Cooper wrote:
> > On 26/10/2020 13:37, Frédéric Pierret wrote:
> > >
> > > If anyone would have any idea of what's going on, that would be
> > > very
> > > appreciated. Thank you.
> >
> > Does booting Xen with `sched=credit` make a difference?
>
> Hmm, I think I have spotted a problem in credit2 which could explain
> the
> hang:
>
> csched2_unit_wake() will NOT put the sched unit on a runqueue in case
> it
> has CSFLAG_scheduled set. This bit will be reset only in
> csched2_context_saved().
>
Exactly, it does not put it back there. However, if it finds a vCPU
with the CSFLAG_scheduled flag set, it should set the
CSFLAG_delayed_runq_add flag.
Unless curr_on_cpu(cpu)==unit or unit_on_runq(svc)==true... which
should not be the case. Or were you saying that we actually are in one
of these situations?
In fact...
> So in case a vcpu (and its unit, of course) is blocked and there has
> been no other vcpu active on its physical cpu but the idle vcpu,
> there
> will be no call of csched2_context_saved(). This will block the vcpu
> to become active in theory for eternity, in case there is no need to
> run another vcpu on the physical cpu.
>
...maybe I am not seeing the exact situation and sequence of events
that you have in mind. What I see is this: [*]
- vCPU V is running, i.e., CSFLAG_scheduled is set
- vCPU V blocks
- we enter schedule()
- schedule() calls do_schedule() --> csched2_schedule()
- we pick idle, so CSFLAG_delayed_runq_add is set for V
- schedule() calls sched_context_switch()
- sched_context_switch() calls context_switch()
- context_switch() calls sched_context_switched()
- sched_context_switched() calls:
  - vcpu_context_saved()
  - unit_context_saved()
- unit_context_saved() calls sched_context_saved() -->
  csched2_context_saved()
- csched2_context_saved():
  - clears CSFLAG_scheduled
  - checks (and clears) CSFLAG_delayed_runq_add
[*] this assumes granularity 1, i.e., no core-scheduling and no
rendezvous. Or was core-scheduling actually enabled?
And if CSFLAG_delayed_runq_add is set **and** the vCPU is runnable, the
vCPU is added back to the runqueue.
So, even if we don't do the actual context switch (i.e., we don't call
__context_switch()) because the next vCPU that we pick when vCPU V
blocks is the idle one, it looks to me that we do get to call
csched2_context_saved().
And it also looks to me that, when we get to that, if the vCPU is
runnable, even if it has the CSFLAG_scheduled still set, we do put it
back to the runqueue.
And if the vCPU blocked, but csched2_unit_wake() ran while
CSFLAG_scheduled was still set, it indeed should mean that the vCPU
itself will be runnable again when we get to csched2_context_saved().
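
To make the sequence I am describing concrete, here is a toy model of
the two flags in plain C. It is only a sketch of the behaviour as I
describe it above, not the actual credit2 code: struct toy_unit and all
the toy_*() helpers are made up for illustration, the flag values are
arbitrary, and locking, per-CPU state, the idle vCPU and the runqueue
itself are all left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the ones discussed above; the values are arbitrary. */
#define CSFLAG_scheduled        (1u << 0)
#define CSFLAG_delayed_runq_add (1u << 1)

/* Hypothetical stand-in for a scheduling unit (not Xen's real struct). */
struct toy_unit {
    unsigned int flags;
    bool runnable;
    bool on_runq;
};

/* Wake path: defer the runqueue insertion while the unit is still
 * marked as scheduled, otherwise queue it right away. */
static void toy_unit_wake(struct toy_unit *u)
{
    u->runnable = true;
    if (u->on_runq)
        return;
    if (u->flags & CSFLAG_scheduled) {
        u->flags |= CSFLAG_delayed_runq_add;
        return;
    }
    u->on_runq = true;
}

/* Context-saved path: clear CSFLAG_scheduled, then test-and-clear
 * CSFLAG_delayed_runq_add and queue the unit if it is runnable. */
static void toy_context_saved(struct toy_unit *u)
{
    bool delayed = (u->flags & CSFLAG_delayed_runq_add) != 0;

    assert(u->flags & CSFLAG_scheduled);
    u->flags &= ~(CSFLAG_scheduled | CSFLAG_delayed_runq_add);
    if (delayed && u->runnable)
        u->on_runq = true;
}

/* V blocks, is woken while CSFLAG_scheduled is still set, and only
 * then the context-saved step runs. Returns true if V ends up queued. */
static bool toy_wake_before_saved(void)
{
    struct toy_unit v = { .flags = CSFLAG_scheduled, .runnable = true };

    v.runnable = false;      /* V blocks */
    toy_unit_wake(&v);       /* wakeup races with the switch to idle */
    toy_context_saved(&v);   /* delayed insertion happens here */
    return v.on_runq && v.flags == 0;
}

/* Same, but the wakeup arrives after the context-saved step. */
static bool toy_wake_after_saved(void)
{
    struct toy_unit v = { .flags = CSFLAG_scheduled, .runnable = true };

    v.runnable = false;      /* V blocks */
    toy_context_saved(&v);   /* nothing to requeue: V is not runnable */
    toy_unit_wake(&v);       /* flag is clear, so V is queued directly */
    return v.on_runq && v.flags == 0;
}
```

In this model, both orderings leave V on the runqueue with both flags
clear, as long as the context-saved step is reached at all. That is
the part I am claiming above does happen, even when we switch to idle.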
Or did you have something completely different in mind, and I'm missing
it?
Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)