[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5][RFC]xen: sched: convert RTDS from time to event driven model





On 2/24/2016 9:02 PM, Dario Faggioli wrote:
Hey,

Here I am, sorry for the delay. :-(
No problem, I think we are almost there.

On Mon, 2016-02-08 at 23:33 -0500, Tianyang Chen wrote:
Changes since v4:
     removed unnecessary replenishment queue checks in vcpu_wake()
     extended replq_remove() to all cases in vcpu_sleep()
     used _deadline_queue_insert() helper function for both queues
     _replq_insert() and _replq_remove() program timer internally

Changes since v3:
     removed running queue.
     added repl queue to keep track of repl events.
     timer is now per scheduler.
     timer is init on a valid cpu in a cpupool.

Signed-off-by: Tianyang Chen <tiche@xxxxxxxxxxxxxx>
Signed-off-by: Meng Xu <mengxu@xxxxxxxxxxxxx>
Signed-off-by: Dagaen Golomb <dgolomb@xxxxxxxxxxxxxx>

So, the actual changelog... why did it disappear? :-)

I will add it in the next version.

diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
index 2e5430f..1f0bb7b 100644
--- a/xen/common/sched_rt.c
+++ b/xen/common/sched_rt.c
@@ -388,6 +424,66 @@ __q_remove(struct rt_vcpu *svc)
  }
/*
+ * Removing a vcpu from the replenishment queue could
+ * re-program the timer for the next replenishment event
+ * if the timer is currently active
+ */
+static inline void
+__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
+{
+    struct rt_private *prv = rt_priv(ops);
+    struct list_head *replq = rt_replq(ops);
+    struct timer* repl_timer = prv->repl_timer;
+
+    if ( __vcpu_on_replq(svc) )
+    {
So, can this be false? If yes, when and why?

+        /*
+         * disarm the timer if removing the first replenishment
event
+         * which is going to happen next
+         */
+        if( active_timer(repl_timer) )
+        {

And here again, isn't it the case that, if there is a vcpu in the
replenishment queue, then the timer _must_ be active? (I.e., if the
above is true, isn't this also always true?)

So, the reason for this check is that the code inside timer handler also calls this function when timer is stopped. We don't want to start the timer inside the timer handler when it's manipulating the replenishment queue right? Because it could be removing/adding back more than one vcpu
and the timer should only be set at the very end of the timer handler.

In fact, I copied a static function from timer.c just to check if a timer is active or not. Not sure if
this is a good practice or not.

+            struct rt_vcpu *next_repl = __replq_elem(replq->next);
+
+            if( next_repl->cur_deadline == svc->cur_deadline )
+                repl_timer->expires = 0;
+
I think we need to call stop_timer() here, don't we? I know that
set_timer() does that itself, but I think it is indeed necessary. E.g.,
right now, you're manipulating the timer without the timer lock.

So when the timer handler is protected by a global lock, it is still necessary to stop the timer?
Also, the timer only runs on a single CPU for a scheduler too.

+            list_del_init(&svc->replq_elem);
+
+            /* re-arm the timer for the next replenishment event */
+            if( !list_empty(replq) )
+            {
+                struct rt_vcpu *svc_next = __replq_elem(replq-
next);
+                set_timer(repl_timer, svc_next->cur_deadline);
+            }
+        }
+
+        else
+            list_del_init(&svc->replq_elem);

I don't like the structure of this function, and I'd ask for a
different if/then/else logic to be used, but let's first see --by
answering the questions above-- if some of these if-s can actually go
away. :-)

 So this function reduces to:

/*
         * disarm the timer if removing the first replenishment event
         * which is going to happen next
         */
        if( active_timer(repl_timer) )
        {
            stop_timer(replq_timer);
            repl_timer->expires = 0;

            list_del_init(&svc->replq_elem);

            /* re-arm the timer for the next replenishment event */
            if( !list_empty(replq) )
            {
                struct rt_vcpu *svc_next = __replq_elem(replq->next);
                set_timer(repl_timer, svc_next->cur_deadline);
            }
        }

        else
            list_del_init(&svc->replq_elem);


+    }
+}
+
+/*
+ * An utility function that inserts a vcpu to a
+ * queue based on certain order (EDF)

End long comments with a full stop.

+ */
+static void
+_deadline_queue_insert(struct rt_vcpu * (*_get_q_elem)(struct
list_head *elem),

There's really a lot of "underscore-prefixing" in this file.

This is, of course, not your fault, and should not be addressed in this
patch. But, at least when it comes to new code, let's avoid making
things worse.

So, "deadline_queue_insert()" is just ok, I think. Yes, I know I did
suggest to call it how it's called in this patch, underscore included,
but I now think it would be better to get rid of that.
Sure.
+    struct rt_vcpu *svc, struct list_head *elem, struct list_head
*queue)
+{
+    struct list_head *iter;
+
+    list_for_each(iter, queue)
+    {
+        struct rt_vcpu * iter_svc = (*_get_q_elem)(iter);
+        if ( svc->cur_deadline <= iter_svc->cur_deadline )
+                break;

This looks too much indented.

@@ -405,22 +500,37 @@ __runq_insert(const struct scheduler *ops,
struct rt_vcpu *svc)
/* add svc to runq if svc still has budget */
      if ( svc->cur_budget > 0 )
-    {
-        list_for_each(iter, runq)
-        {
-            struct rt_vcpu * iter_svc = __q_elem(iter);
-            if ( svc->cur_deadline <= iter_svc->cur_deadline )
-                    break;
-         }
-        list_add_tail(&svc->q_elem, iter);
-    }
+        _deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
      else
-    {
          list_add(&svc->q_elem, &prv->depletedq);
-    }
Can we ASSERT() something about a replenishment event being queued, in
either case?
Do you mean asserting if the svc added is on queue after being inserted in either case?
As I said already, use full words in comments ("Insert svc in the
replenishment timer queue" or "Insert svc in the replenishment events
list").

+ * vcpus that needs to be repl earlier go first.

Ditto. And this can just be something like "in replenishment time
order", added to the sentence above.

+ * scheduler private lock serializes this operation

This is true for _everything_ in this scheduler right now, so there's
no real need to say it.

+ * it could re-program the timer if it fires later than
+ * this vcpu's cur_deadline.

Do you mean "The timer may be re-programmed if svc is inserted at the
front of the events list" ?

Also, this is used to program
+ * the timer for the first time.

"Also, this is what programs the timer the first time, when called from
xxx"
I agree with above and for this one, if svc is going to be in the front of the list, it programs the timer. Or in the other case It sees that repl_timer->expires == 0, which is either during system boot or one the only one vcpu on the replenishment list has been removed, in replq_remove(). So it's not called anywhere in such sense. Maybe "It programs the timer for the first timer, or when the only vcpu on replenishment event list has
been removed"?
+ */
+static void
+__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
+{
+    struct list_head *replq = rt_replq(ops);
+    struct rt_private *prv = rt_priv(ops);
+    struct timer *repl_timer = prv->repl_timer;
+
+    ASSERT( !__vcpu_on_replq(svc) );
+
+    _deadline_queue_insert(&__replq_elem, svc, &svc->replq_elem,
replq);
+
+    if( repl_timer->expires == 0 ||
+        ( active_timer(repl_timer) && repl_timer->expires > svc-
cur_deadline ) )
+        set_timer(repl_timer,svc->cur_deadline);

Is it the case that you need to call set_timer() if (and only if) svc
has been inserted at the *front* of the replenishment queue? (If list
was empty, as in the case of first timer activation, it would be the
first and only.)

If that is the case, I'd just make deadline_queue_insert() return a
flag to that effect (e.g., it can return 1 if svc is the new first
element, 0 if it is not), and use the flag itself to guard the (re-)arm
of the timer... What do you think?
Yeah good call.
@@ -651,6 +785,10 @@ rt_vcpu_remove(const struct scheduler *ops,
struct vcpu *vc)
      lock = vcpu_schedule_lock_irq(vc);
      if ( __vcpu_on_q(svc) )
          __q_remove(svc);
+
+    if( __vcpu_on_replq(svc) )
+        __replq_remove(ops,svc);
+
As per the code in this patch looks like, you're
checking __vcpu_on_replq(svc) twice. In fact, you do it here and inside
__replq_remove(). I've already said that I'd like for the check inside
__rqplq_remove() to go away so, keep this one here (if it's really
necessary) and kill the other one inside the function.

@@ -924,6 +1027,9 @@ rt_vcpu_sleep(const struct scheduler *ops,
struct vcpu *vc)
          __q_remove(svc);
      else if ( svc->flags & RTDS_delayed_runq_add )
          clear_bit(__RTDS_delayed_runq_add, &svc->flags);
+
+    if( __vcpu_on_replq(svc) )
+        __replq_remove(ops, svc);

Same here.
So, what about __q_remove()? It looks like couple places are double checking as well.
@@ -1051,6 +1153,18 @@ rt_vcpu_wake(const struct scheduler *ops,
struct vcpu *vc)
      else
          SCHED_STAT_CRANK(vcpu_wake_not_runnable);
+ /* budget repl here is needed before inserting back to runq. If
so,

Comment style.

Also, what do you mean with that "If so"... "If so" what?
I mean if the budget is replenished here, then re-insert back. I guess it's self-explanatory.
+     * it should be re-inserted back to the replenishment queue.
+     */
+    if ( now >= svc->cur_deadline)
+    {
+        rt_update_deadline(now, svc);
+        __replq_remove(ops, svc);
+    }
+
+    if( !__vcpu_on_replq(svc) )
+        __replq_insert(ops, svc);
+
And here I am again: is it really necessary to check whether svc is not
in the replenishment queue? It looks to me that it really should not be
there... but maybe it can, because we remove the event from the queue
when the vcpu sleeps, but *not* when the vcpu blocks?
Yeah. That is the case where I keep getting assertion failure if it's removed. I'm thinking when a vcpu unblocks, it could potentially fall through here. And like you said, mostly spurious sleep happens when a vcpu is running and it could happen in other cases, although rare.

Thanks,
Tianyang


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.