commit baccc50fcb7cec6f5ec84473dad28847b65b65e8
Author: Dario Faggioli
Date:   Fri May 28 17:12:48 2021 +0200

    credit2: make sure we pick a runnable unit from the runq if there is one

    A !runnable unit (temporarily) present in the runq may cause us to
    stop scanning the runq itself too early. Of course, we don't run any
    non-runnable vCPUs, but we end the scan and we fall back to picking
    the idle unit. In other words, this prevents us from finding and
    picking the actual unit that we're meant to start running (which
    might be further ahead in the runq).

    Depending on the vCPU pinning configuration, this may leave such a
    unit stuck in the runq for a long time, causing malfunctions inside
    the guest.

    Fix this by checking runnable/non-runnable status up-front, in the
    runq scanning function.

    Reported-by: Michał Leszczyński
    Reported-by: Dion Kant
    Signed-off-by: Dario Faggioli
    Reviewed-by: George Dunlap

diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
index d6ebd126de..d89e340905 100644
--- a/xen/common/sched_credit2.c
+++ b/xen/common/sched_credit2.c
@@ -3361,6 +3361,10 @@ runq_candidate(struct csched2_runqueue_data *rqd,
                         (unsigned char *)&d);
         }
 
+        /* Skip non runnable vcpus that we (temporarily) have in the runq */
+        if ( unlikely(!vcpu_runnable(svc->vcpu)) )
+            continue;
+
         /* Only consider vcpus that are allowed to run on this processor. */
         if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
             continue;
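
The effect of the change can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the Xen code: `struct toy_unit`, `pick_late_check()` and `pick_early_check()` are hypothetical names, the run queue is a plain array, and CPU affinity is reduced to a bitmask. The first function models the behaviour described in the commit message (the scan ends at the first eligible entry and runnability is only checked afterwards), the second models the up-front check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a scheduling unit; not the Xen structure. */
struct toy_unit {
    int id;
    bool runnable;          /* false while the unit is temporarily blocked */
    unsigned int affinity;  /* bitmask of CPUs the unit may run on */
};

/*
 * Simplified model of the old behaviour: the scan stops at the first
 * entry allowed on this CPU, and only then is runnability checked.
 * A non-runnable entry therefore hides everything queued behind it.
 */
const struct toy_unit *pick_late_check(const struct toy_unit *runq, size_t n,
                                       unsigned int cpu,
                                       const struct toy_unit *idle)
{
    for (size_t i = 0; i < n; i++) {
        if (!(runq[i].affinity & (1u << cpu)))
            continue;
        /* Scan ends here; fall back to idle if the pick cannot run. */
        return runq[i].runnable ? &runq[i] : idle;
    }
    return idle;
}

/*
 * Simplified model of the fixed behaviour: non-runnable entries are
 * skipped inside the scan, so a runnable unit further along is found.
 */
const struct toy_unit *pick_early_check(const struct toy_unit *runq, size_t n,
                                        unsigned int cpu,
                                        const struct toy_unit *idle)
{
    for (size_t i = 0; i < n; i++) {
        if (!runq[i].runnable)
            continue;
        if (!(runq[i].affinity & (1u << cpu)))
            continue;
        return &runq[i];
    }
    return idle;
}
```

With a queue holding a non-runnable unit followed by a runnable one, both allowed on CPU 0, the late-check variant returns the idle unit while the early-check variant returns the second unit. The real `runq_candidate()` also weighs credit, soft affinity and migration resistance, all of which this model leaves out.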