
Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model



Hey again,

Thanks for turning this around so quickly.

We are getting closer and closer, although (of course :-)) I have some
more comments.

However, is there a particular reason why you are keeping the RFC tag?
As long as it is there, it's like saying that you are chasing feedback,
but you do not yourself think the patch should be considered for
upstreaming. As far as my opinion goes, this patch is not ready to go
in right now (as I said, I've got questions and comments), but its
status is way past RFC.

On Thu, 2016-02-25 at 15:05 -0500, Tianyang Chen wrote:
> changes since v5:
>     removed unnecessary vcpu_on_replq() checks
>     deadline_queue_insert() returns a flag to indicate if it's
>     needed to re-program the timer
>     removed unnecessary timer checks
>     added helper functions to remove vcpus from queues
>     coding style
> 
> Changes since v4:
>     removed unnecessary replenishment queue checks in vcpu_wake()
>     extended replq_remove() to all cases in vcpu_sleep()
>     used _deadline_queue_insert() helper function for both queues
>     _replq_insert() and _replq_remove() program timer internally
> 
> Changes since v3:
>     removed running queue.
>     added repl queue to keep track of repl events.
>     timer is now per scheduler.
>     timer is init on a valid cpu in a cpupool.
> 
So, this does not belong here. It is ok to have it in this part of the
email, but it should not land in the actual commit changelog, once the
patch is committed into Xen's git tree.

The way to achieve that is to put this summary of changes below the
actual changelog, and below the Signed-off-by lines, after a marker
that looks like this: "---".
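
For instance, the patch email would then be laid out like this (the
name/email in the Signed-off-by line is obviously just a placeholder):

```
xen: sched: convert RTDS from time to event driven model

Budget replenishment and enforcement are separated by adding
a replenishment timer, which fires at the next most imminent
release time of all runnable vcpus.

Signed-off-by: <your name and email>
---
Changes since v5:
    removed unnecessary vcpu_on_replq() checks
    ...
```

Everything after the "---" marker is kept in the email for reviewers,
but dropped automatically when the patch is applied to the tree.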

> Budget replenishment and enforcement are separated by adding
> a replenishment timer, which fires at the next most imminent
> release time of all runnable vcpus.
> 
> A new runningq has been added to keep track of all vcpus that
> are on pcpus.
> 
Mmm.. Is this the proper changelog? runningq is something we discussed,
and that appeared in v2, but is certainly no longer here... :-O

> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 2e5430f..16f77f9 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> 
> @@ -213,8 +220,14 @@ static inline struct list_head
> *rt_depletedq(const struct scheduler *ops)
>      return &rt_priv(ops)->depletedq;
>  }
>  
> +static inline struct list_head *rt_replq(const struct scheduler
> *ops)
> +{
> +    return &rt_priv(ops)->replq;
> +}
> +
>  /*
> - * Queue helper functions for runq and depletedq
> + * Queue helper functions for runq, depletedq
> + * and replenishment event queue
>
Full stop. :-)

In any case, I'd change this in something like:
"Helper functions for manipulating the runqueue, the depleted queue,
and the replenishment events queue."

> @@ -228,6 +241,18 @@ __q_elem(struct list_head *elem)
>      return list_entry(elem, struct rt_vcpu, q_elem);
>  }
>  
> +static struct rt_vcpu *
> +__replq_elem(struct list_head *elem)
> +{
> +    return list_entry(elem, struct rt_vcpu, replq_elem);
> +}
> +
> +static int
> +__vcpu_on_replq(const struct rt_vcpu *svc)
> +{
> +   return !list_empty(&svc->replq_elem);
> +}
> +
>
Ok, sorry for changing my mind again, but I really can't stand seeing
these underscores. Please, rename these as replq_elem() and
vcpu_on_replq(). There is no sensible reason why we should prefix
these with '__'.

I know, that will create some amount of inconsistency, but:
 - there is inconsistency already (here and in other sched_* files);
 - not introducing more __-prefixed functions is a step in the right 
   direction; we'll convert the ones that are already there over time.

> + * Removing a vcpu from the replenishment queue could
> + * re-program the timer for the next replenishment event
> + * if there is any on the list
> 
> + */
> +static inline void
> +__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct rt_private *prv = rt_priv(ops);
> +    struct list_head *replq = rt_replq(ops);
> +    struct timer* repl_timer = prv->repl_timer;
> +
> +    /*
> +     * Disarm the timer before re-programming it.
> +     * It doesn't matter if the vcpu to be removed
> +     * is on top of the list or not because the timer
> +     * is stopped and needs to be re-programmed anyways
> +     */
> +    stop_timer(repl_timer);
> +
> +    deadline_queue_remove(&svc->replq_elem);
> +
> +    /* re-arm the timer for the next replenishment event */
> +    if( !list_empty(replq) )
> +    {
> +        struct rt_vcpu *svc_next = __replq_elem(replq->next);
> +        set_timer(repl_timer, svc_next->cur_deadline);
> +    }
>
Wait, maybe you misunderstood and/or I did not make myself clear enough
(in which case, sorry). I never meant to say "always stop the timer".
Actually, in one of my last emails I said the opposite, and I think
that would be the best thing to do.

Do you think there is any specific reason why we need to always stop
and restart it? If not, I think we can:
 - have deadline_queue_remove() also return whether the element 
   removed was the first one (i.e., the one the timer was programmed 
   after);
 - if it was, re-program the timer after the new front of the queue;
 - if it wasn't, nothing to do.

Thoughts?

> +/*
> + * An utility function that inserts a vcpu to a
> + * queue based on certain order (EDF). The returned
> + * value is 1 if a vcpu has been inserted to the
> + * front of a list
> + */
> +static int
> +deadline_queue_insert(struct rt_vcpu * (*_get_q_elem)(struct
> list_head *elem),
> +    struct rt_vcpu *svc, struct list_head *elem, struct list_head
> *queue)
> +{
> +    struct list_head *iter;
> +    int pos = 0;
> +
> +    list_for_each(iter, queue)
> +    {
> +        struct rt_vcpu * iter_svc = (*_get_q_elem)(iter);
> +        if ( svc->cur_deadline <= iter_svc->cur_deadline )
> +            break;
> +
> +        pos++;
> +    }
> +
> +    list_add_tail(elem, iter);
> +    return !pos;
>  }
> 
Ok.

> @@ -405,22 +500,38 @@ __runq_insert(const struct scheduler *ops,
> struct rt_vcpu *svc)
>  
>      /* add svc to runq if svc still has budget */
>      if ( svc->cur_budget > 0 )
> -    {
> -        list_for_each(iter, runq)
> -        {
> -            struct rt_vcpu * iter_svc = __q_elem(iter);
> -            if ( svc->cur_deadline <= iter_svc->cur_deadline )
> -                    break;
> -         }
> -        list_add_tail(&svc->q_elem, iter);
> -    }
> +        deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
>      else
>      {
>          list_add(&svc->q_elem, &prv->depletedq);
> +        ASSERT( __vcpu_on_replq(svc) );
>
So, by doing this, you're telling me that, if the vcpu is added to the
depleted queue, there must be a replenishment planned for it (or the
ASSERT() would fail).

But if we are adding it to the runq, there may or may not be a
replenishment planned for it.

Is this what we want? Why must there not already be a replenishment
planned for a vcpu going onto the runq (to happen at its next
deadline)?

>  /*
> + * Insert svc into the replenishment event list
> + * in replenishment time order.
> + * vcpus that need to be replished earlier go first.
> + * The timer may be re-programmed if svc is inserted
> + * at the front of the event list.
> + */
> +static void
> +__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct list_head *replq = rt_replq(ops);
> +    struct rt_private *prv = rt_priv(ops);
> +    struct timer *repl_timer = prv->repl_timer;
> +    int set;
> +
> +    ASSERT( !__vcpu_on_replq(svc) );
> +
> +    set = deadline_queue_insert(&__replq_elem, svc, &svc-
> >replq_elem, replq);
>
> +    if( set )
> +        set_timer(repl_timer,svc->cur_deadline);
> +}
A matter of taste, mostly, but I'd avoid the local variable. (If this
function at some point becomes more complex, then we'll see; for now,
I think it's fine to just use the return value of
deadline_queue_insert() inside the if().)
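
I.e., something of this shape (just a sketch reusing the identifiers
from your patch, untested):

```
static void
__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct list_head *replq = rt_replq(ops);
    struct timer *repl_timer = rt_priv(ops)->repl_timer;

    ASSERT( !__vcpu_on_replq(svc) );

    if ( deadline_queue_insert(&__replq_elem, svc, &svc->replq_elem, replq) )
        set_timer(repl_timer, svc->cur_deadline);
}
```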

> @@ -868,6 +968,8 @@ rt_schedule(const struct scheduler *ops, s_time_t
> now, bool_t tasklet_work_sched
>          set_bit(__RTDS_delayed_runq_add, &scurr->flags);
>  
>      snext->last_start = now;
> +
>
You don't need to add this blank line...

> +    ret.time =  -1; /* if an idle vcpu is picked */ 
>      if ( !is_idle_vcpu(snext->vcpu) )
>      {
>          if ( snext != scurr )
> @@ -880,9 +982,11 @@ rt_schedule(const struct scheduler *ops,
> s_time_t now, bool_t tasklet_work_sched
>              snext->vcpu->processor = cpu;
>              ret.migrated = 1;
>          }
> +
> +        ret.time = snext->budget; /* invoke the scheduler next time
> */
> +
...nor this one.
>      }

> @@ -924,6 +1028,8 @@ rt_vcpu_sleep(const struct scheduler *ops,
> struct vcpu *vc)
>          __q_remove(svc);
>      else if ( svc->flags & RTDS_delayed_runq_add )
>          clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> +
> +    __replq_remove(ops, svc);
>
What I said in my last email is that you probably can get rid of this
(see below, when commenting on context_saved()).
 
> @@ -1051,6 +1153,22 @@ rt_vcpu_wake(const struct scheduler *ops,
> struct vcpu *vc)
>      else
>          SCHED_STAT_CRANK(vcpu_wake_not_runnable);
>  
> +    /*
> +     * If a deadline passed while svc was asleep/blocked, we need
> new
> +     * scheduling parameters ( a new deadline and full budget), and
> +     * also a new replenishment event
> +     */
> +    if ( now >= svc->cur_deadline)
> +    {   
> +        rt_update_deadline(now, svc);
> +        __replq_remove(ops, svc);
> +    }
> +
> +    if( !__vcpu_on_replq(svc) )
> +    {
> +        __replq_insert(ops, svc);
> +        ASSERT( vcpu_runnable(vc) );
>
Mmm... What's this assert about?

> @@ -1087,10 +1193,6 @@ static void
>  rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>  {
>      struct rt_vcpu *svc = rt_vcpu(vc);
> -    struct rt_vcpu *snext = NULL;
> -    struct rt_dom *sdom = NULL;
> -    struct rt_private *prv = rt_priv(ops);
> -    cpumask_t *online;
>      spinlock_t *lock = vcpu_schedule_lock_irq(vc);
>  
>      clear_bit(__RTDS_scheduled, &svc->flags);
> @@ -1102,14 +1204,7 @@ rt_context_saved(const struct scheduler *ops,
> struct vcpu *vc)
>           likely(vcpu_runnable(vc)) )
>      {
>          __runq_insert(ops, svc);
> -        __repl_update(ops, NOW());
> -
> -        ASSERT(!list_empty(&prv->sdom));
> -        sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
> -        online = cpupool_domain_cpumask(sdom->dom);
> -        snext = __runq_pick(ops, online); /* pick snext from ALL
> cpus */
> -
> -        runq_tickle(ops, snext);
> +        runq_tickle(ops, svc);
>      }
>
So, here we are.

What I meant was to make this function look more or less like this:

static void
rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    spinlock_t *lock = vcpu_schedule_lock_irq(vc);

    clear_bit(__RTDS_scheduled, &svc->flags);
    /* not insert idle vcpu to runq */
    if ( is_idle_vcpu(vc) )
        goto out;

    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
    {
        __runq_insert(ops, svc);
        runq_tickle(ops, svc);
    }
    else
        __replq_remove(ops, svc);
out:
    vcpu_schedule_unlock_irq(lock, vc);
}

And, as said above, if you do this, try also removing the
__replq_remove() call from rt_vcpu_sleep(), this one should cover for
that (and, actually, more than just that!).

After all this, check whether the assert in __replq_insert() still
triggers, and let me know.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

