Re: Recent upgrade of 4.13 -> 4.14 issue
On Mon, 2020-10-26 at 15:30 +0100, Jürgen Groß wrote:
> On 26.10.20 14:54, Andrew Cooper wrote:
> > On 26/10/2020 13:37, Frédéric Pierret wrote:
> > >
> > > If anyone would have any idea of what's going on, that would be
> > > very appreciated. Thank you.
> >
> > Does booting Xen with `sched=credit` make a difference?
>
> Hmm, I think I have spotted a problem in credit2 which could explain
> the hang:
>
> csched2_unit_wake() will NOT put the sched unit on a runqueue in case
> it has CSFLAG_scheduled set. This bit will be reset only in
> csched2_context_saved().
>
Exactly, it does not put it back there. However, if it finds a vCPU
with the CSFLAG_scheduled flag set, it should set the
CSFLAG_delayed_runq_add flag.

Unless curr_on_cpu(cpu)==unit or unit_on_runq(svc)==true... which
should not be the case. Or were you saying that we actually are in one
of these situations?

In fact...

> So in case a vcpu (and its unit, of course) is blocked and there has
> been no other vcpu active on its physical cpu but the idle vcpu,
> there will be no call of csched2_context_saved(). This will block the
> vcpu to become active in theory for eternity, in case there is no
> need to run another vcpu on the physical cpu.
>
...maybe I am not seeing exactly what situation and sequence of events
you have in mind.
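To make the flag handshake discussed above concrete, here is a minimal toy model in C. It is only a sketch under stated assumptions, not the Xen source: `struct model_unit`, the `MODEL_FLAG_*` constants, `model_unit_wake()`, `model_context_saved()` and the demo helper are hypothetical names invented for illustration, and the real csched2 code additionally deals with locking, load tracking and tickling, all of which are left out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CSFLAG_scheduled and CSFLAG_delayed_runq_add. */
#define MODEL_FLAG_SCHEDULED        (1 << 0)
#define MODEL_FLAG_DELAYED_RUNQ_ADD (1 << 1)

/* Toy model of a scheduling unit; NOT the real struct csched2_unit. */
struct model_unit {
    int  flags;     /* combination of MODEL_FLAG_* bits */
    bool runnable;  /* true once the vCPU has been woken */
    bool on_runq;   /* true while the unit sits on a runqueue */
};

/* Models the wake path: defer the runqueue insert while still "scheduled". */
void model_unit_wake(struct model_unit *u)
{
    u->runnable = true;

    if (u->on_runq)
        return;                                   /* already queued */

    if (u->flags & MODEL_FLAG_SCHEDULED) {
        /* Context not saved yet: only leave a note for later. */
        u->flags |= MODEL_FLAG_DELAYED_RUNQ_ADD;
        return;
    }

    u->on_runq = true;                            /* normal wakeup */
}

/* Models the context-saved path: clear both flags, honour a deferred insert. */
void model_context_saved(struct model_unit *u)
{
    bool delayed = (u->flags & MODEL_FLAG_DELAYED_RUNQ_ADD) != 0;

    u->flags &= ~(MODEL_FLAG_SCHEDULED | MODEL_FLAG_DELAYED_RUNQ_ADD);

    if (delayed && u->runnable)
        u->on_runq = true;
}

/* Demo: V blocks, is woken while still "scheduled", then its context is saved. */
bool model_demo_wake_while_scheduled(void)
{
    struct model_unit v = { MODEL_FLAG_SCHEDULED, true, false };

    v.runnable = false;           /* V blocks */
    model_unit_wake(&v);          /* wakeup races with the context switch */
    assert(!v.on_runq);           /* insert was deferred ... */
    assert(v.flags & MODEL_FLAG_DELAYED_RUNQ_ADD);

    model_context_saved(&v);      /* ... and is performed here */
    return v.on_runq;
}
```

In this model, a unit woken while the scheduled flag is still set is lost only if the context-saved step never runs, so the open question is whether that step is reached when the next vCPU picked is the idle one.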
What I see is this: [*]

- vCPU V is running, i.e., CSFLAG_scheduled is set
- vCPU V blocks
- we enter schedule()
- schedule() calls do_schedule() --> csched2_schedule()
- we pick idle, so CSFLAG_delayed_runq_add is set for V
- schedule() calls sched_context_switch()
- sched_context_switch() calls context_switch()
- context_switch() calls sched_context_switched()
- sched_context_switched() calls:
  - vcpu_context_saved()
  - unit_context_saved()
- unit_context_saved() calls sched_context_saved() -->
  csched2_context_saved()
- csched2_context_saved():
  - clears CSFLAG_scheduled
  - checks (and clears) CSFLAG_delayed_runq_add

[*] this assumes granularity 1, i.e., no core-scheduling and no
    rendezvous. Or was core-scheduling actually enabled?

And if CSFLAG_delayed_runq_add is set **and** the vCPU is runnable, the
vCPU is added back to the runqueue.

So, even if we don't do the actual context switch (i.e., we don't call
__context_switch()) because the next vCPU that we pick when vCPU V
blocks is the idle one, it looks to me that we do get to call
csched2_context_saved().

And it also looks to me that, when we get to that, if the vCPU is
runnable, even if it still has CSFLAG_scheduled set, we do put it back
on the runqueue.

And if the vCPU blocked, but csched2_unit_wake() ran while
CSFLAG_scheduled was still set, it indeed should mean that the vCPU
itself will be runnable again when we get to csched2_context_saved().

Or did you have something completely different in mind, and I'm missing
it?

Regards
-- 
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)

Attachment:
signature.asc