
[Xen-devel] [PATCH v2 42/62] sched/null: skip vCPUs on the waitqueue that are blocked



From: Roger Pau Monne <roger.pau@xxxxxxxxxx>

Avoid scheduling vCPUs that are down: there's no point in assigning
them to a pCPU because they are not going to run anyway.

Since down vCPUs are not assigned to pCPUs after this change, force a
rescheduling when a vCPU is brought up if it's on the waitqueue.  Also,
when scheduling, try to pick a vCPU from the waitqueue if the pCPU is
running idle.

There's no current way to prevent a user from adding more vcpus to a
pool than there are pcpus (if nothing else, by creating a new VM in a
given pool), or from taking pcpus from a pool in which #vcpus >=
#pcpus.

The null scheduler deals with this by having a queue of "unassigned"
vcpus that are waiting for a free pcpu.  When a pcpu becomes
available, it will do the assignment.  When a pcpu that has a vcpu
assigned is removed from the pool, that vcpu is assigned to a
different pcpu if one is available; if not, it is put on the list.

In the case of shim mode, this also seems to happen whenever curvcpus
< maxvcpus: The L1 hypervisor (shim) only sees curvcpus cpus on which
to schedule L2 vcpus, but the L2 guest has maxvcpus vcpus to schedule,
of which (maxvcpus-curvcpus) are marked down.  In this case, it also
seems that the null scheduler sometimes schedules a down vcpu when
there are up vcpus on the list; meaning that the up vcpus are never
scheduled.

Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
Cc: Dario Faggioli <raistlin@xxxxxxxx>
---
Changes since v1:
 - Force a rescheduling when a vCPU is brought up.
 - Try to pick a vCPU from the waitqueue if running the idle vCPU.
 - Add George Dunlap's description of the problem to the commit log.
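
Note for reviewers: the starvation described in the commit message can
be illustrated with a small standalone sketch.  This is not the Xen
code; struct toy_vcpu and toy_pick() are hypothetical names made up for
this note, the "down" field only models _VPF_down being set in
pause_flags, and affinity checks and locking are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a waitqueue entry; not Xen's struct null_vcpu. */
struct toy_vcpu {
    int id;
    bool down;   /* models _VPF_down being set in pause_flags */
};

/*
 * Return the first waitqueue entry, in queue order, that may be assigned.
 * With skip_down == false this models taking the head of the queue even
 * if it is down; with skip_down == true it models the scan after this
 * patch, which passes over down vCPUs.
 */
static const struct toy_vcpu *toy_pick(const struct toy_vcpu *waitq, size_t n,
                                       bool skip_down)
{
    assert(waitq != NULL || n == 0);

    for ( size_t i = 0; i < n; i++ )
    {
        if ( skip_down && waitq[i].down )
            continue;   /* Skip vCPUs that are down. */
        return &waitq[i];
    }

    return NULL;        /* Nothing eligible: the pCPU stays idle. */
}
```

With a queue holding two down vCPUs followed by one that is up, the
unfiltered pick returns a down vCPU, so the pCPU is taken by something
that never runs; the filtered pick returns the up vCPU instead.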
---
 xen/common/sched_null.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/common/sched_null.c b/xen/common/sched_null.c
index b4a24baf8e..bacfb31cb3 100644
--- a/xen/common/sched_null.c
+++ b/xen/common/sched_null.c
@@ -574,6 +574,8 @@ static void null_vcpu_wake(const struct scheduler *ops, struct vcpu *v)
     {
         /* Not exactly "on runq", but close enough for reusing the counter */
         SCHED_STAT_CRANK(vcpu_wake_onrunq);
+        /* Force a rescheduling in case some idle CPU can pick this vCPU */
+        cpumask_raise_softirq(&cpu_online_map, SCHEDULE_SOFTIRQ);
         return;
     }
 
@@ -761,9 +763,10 @@ static struct task_slice null_schedule(const struct scheduler *ops,
     /*
      * We may be new in the cpupool, or just coming back online. In which
      * case, there may be vCPUs in the waitqueue that we can assign to us
-     * and run.
+     * and run. Also check whether this CPU is running idle, in which case try
+     * to pick a vCPU from the waitqueue.
      */
-    if ( unlikely(ret.task == NULL) )
+    if ( unlikely(ret.task == NULL || ret.task == idle_vcpu[cpu]) )
     {
         spin_lock(&prv->waitq_lock);
 
@@ -781,6 +784,10 @@ static struct task_slice null_schedule(const struct scheduler *ops,
         {
             list_for_each_entry( wvc, &prv->waitq, waitq_elem )
             {
+                if ( test_bit(_VPF_down, &wvc->vcpu->pause_flags) )
+                    /* Skip vCPUs that are down. */
+                    continue;
+
                 if ( bs == BALANCE_SOFT_AFFINITY &&
                      !has_soft_affinity(wvc->vcpu, wvc->vcpu->cpu_hard_affinity) )
                     continue;
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

