
Re: [Xen-devel] scheduler independent forced vcpu selection



* Stephan Diestelhorst <sd386@xxxxxxxxxxxx> [2005-05-18 09:04]:
> The timer assertion might be the old scheduling timer, which gets
> probably reset, but not deleted beforehand... And the on runqueue
> assertion suggests that you are 'stealing' the domain from the
> schedulers queues without giving it a chance to notice.

Looking at both bvt and sedf, the runqueue is ordered by some metric or
another (evt and deadline, respectively).  What I think we need is a way
to swap positions in the runqueues.  That is, if the lock holder is
runnable, I want the holder to run instead of current.  Is there some
way to do this in a scheduler-independent manner with the current set of
scheduler ops defined in sched-if.h?

I noticed that neither bvt nor sedf implements the rem_task function,
which I thought could help with the 'stealing' by notifying the
scheduler that prev was going away (removing it from the runqueue), but
simply removing the exec_domain from the runqueue didn't help.

I'm including the patch I'm currently using so you can get a better
idea of the modifications I'm making to schedule.c.

-- 
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253   T/L: 678-9253
ryanh@xxxxxxxxxx


---
--- b/xen/common/schedule.c     2005-05-17 22:16:55.000000000 -0500
+++ c/xen/common/schedule.c     2005-05-18 12:42:44.765691872 -0500
@@ -273,6 +273,49 @@
     return 0;
 }
 
+/* Confer control to another vcpu */
+long do_confer(unsigned int vcpu, unsigned int yield_count)
+{
+    struct domain *d = current->domain;
+   
+    /* count hcalls */
+    current->confercnt++;
+
+    /* Validate CONFER prereqs:
+     * - vcpu is within bounds
+     * - vcpu is valid in this domain
+     * - current has not already conferred its slice to vcpu
+     * - vcpu is not already running
+     * - designated vcpu's yield_count matches the value from the call
+     *
+     * If 1-5 are ok, then set conferred state and enter the scheduler
+     */
+
+    if (vcpu >= MAX_VIRT_CPUS)
+        return 0;
+
+    if (d->exec_domain[vcpu] == NULL)
+        return 0;
+
+    if (current->conferred != VCPU_CANCONFER)
+        return 0;
+
+    /* even counts indicate a running vcpu, odd is preempted/conferred */
+    if ((d->exec_domain[vcpu]->vcpu_info->yield_count & 1) == 0)
+        return 0;
+
+    if (d->exec_domain[vcpu]->vcpu_info->yield_count != yield_count)
+        return 0;
+
+    /*
+     * set which vcpu should run in conferred state, request scheduling
+     */
+    current->conferred = (VCPU_CONFERRING|vcpu);
+    raise_softirq(SCHEDULE_SOFTIRQ);
+
+    return 0;
+}
+
 /*
  * Demultiplex scheduler-related hypercalls.
  */
@@ -412,8 +455,9 @@
  */
 static void __enter_scheduler(void)
 {
-    struct exec_domain *prev = current, *next = NULL;
+    struct exec_domain *prev = current, *next = NULL, *holder = NULL;
     int                 cpu = prev->processor;
+    unsigned int        holder_vcpu;
     s_time_t            now;
     struct task_slice   next_slice;
     s32                 r_time;     /* time for new dom to run */
@@ -436,12 +480,39 @@
 
     prev->cpu_time += now - prev->lastschd;
 
-    /* get policy-specific decision on scheduling... */
-    next_slice = ops.do_schedule(now);
+    /* get ed pointer to holder vcpu */
+    holder_vcpu = 0xffff & prev->conferred;
+    holder = prev->domain->exec_domain[holder_vcpu];
+
+    if (unlikely(prev->conferred & VCPU_CONFERRING) &&
+        domain_runnable(holder)) 
+    {
+        /* run holder next */
+        next = holder;
+
+        /* run for the remainder of prev's slice */
+        r_time = schedule_data[cpu].s_timer.expires - now;
+
+        /* increment confer counters */
+        prev->confer_out++;
+        next->confer_in++;
+
+        /* change prev's confer state to prevent re-entrance */
+        prev->conferred = VCPU_CONFERRED;
+
+    } else {
+        /* get policy-specific decision on scheduling... */
+        next_slice = ops.do_schedule(now);
+
+        r_time = next_slice.time;
+        next = next_slice.task;
+    }
+
+    /* 
+     * always clear conferred state so this vcpu can confer during its slice
+     */
+    next->conferred = 0;
 
-    r_time = next_slice.time;
-    next = next_slice.task;
-    
     schedule_data[cpu].curr = next;
     
     next->lastschd = now;
@@ -455,6 +526,12 @@
 
     spin_unlock_irq(&schedule_data[cpu].schedule_lock);
 
+    /* bump vcpu yield_count when controlling domain is not-idle */
+    if ( !is_idle_task(prev->domain) )
+        prev->vcpu_info->yield_count++;
+    if ( !is_idle_task(next->domain) )
+        next->vcpu_info->yield_count++;
+
     if ( unlikely(prev == next) ) {
 #ifdef ADV_SCHED_HISTO
         adv_sched_hist_to_stop(cpu);

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

