
Re: [Xen-devel] [PATCH v6][RFC]xen: sched: convert RTDS from time to event driven model



Hey again,

Thanks for turning this around so quickly.

We are getting closer and closer, although (of course :-)) I have some
more comments.

However, is there a particular reason why you are keeping the RFC tag?
As long as it is there, it's like saying that you are chasing feedback,
but you do not yourself think the patch should be considered for
upstreaming. As far as my opinion goes, this patch is not ready to go
in right now (as I said, I've got questions and comments), but its
status is way past RFC.

On Thu, 2016-02-25 at 15:05 -0500, Tianyang Chen wrote:
> changes since v5:
>     removed unnecessary vcpu_on_replq() checks
>     deadline_queue_insert() returns a flag to indicate if it's
>     needed to re-program the timer
>     removed unnecessary timer checks
>     added helper functions to remove vcpus from queues
>     coding style
> 
> Changes since v4:
>     removed unnecessary replenishment queue checks in vcpu_wake()
>     extended replq_remove() to all cases in vcpu_sleep()
>     used _deadline_queue_insert() helper function for both queues
>     _replq_insert() and _replq_remove() program timer internally
> 
> Changes since v3:
>     removed running queue.
>     added repl queue to keep track of repl events.
>     timer is now per scheduler.
>     timer is init on a valid cpu in a cpupool.
> 
So, this does not belong here. It is ok to have it in this part of the
email, but it should not land in the actual commit changelog, once the
patch is committed into Xen's git tree.

The way to achieve that is to put this summary of changes below the
actual changelog, and below the Signed-off-by lines, after a marker
that looks like this: "---".
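
For instance, the patch email would then be laid out like this (the
name/email in the Signed-off-by line is obviously just a placeholder):

```
xen: sched: convert RTDS from time to event driven model

Budget replenishment and enforcement are separated by adding
a replenishment timer, which fires at the next most imminent
release time of all runnable vcpus.

Signed-off-by: <your name and email>
---
Changes since v5:
    removed unnecessary vcpu_on_replq() checks
    ...
```

Everything after the "---" marker is kept in the email for reviewers,
but dropped automatically when the patch is applied to the tree.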

> Budget replenishment and enforcement are separated by adding
> a replenishment timer, which fires at the next most imminent
> release time of all runnable vcpus.
> 
> A new runningq has been added to keep track of all vcpus that
> are on pcpus.
> 
Mmm.. Is this the proper changelog? runningq is something we discussed,
and that appeared in v2, but is certainly no longer here... :-O

> diff --git a/xen/common/sched_rt.c b/xen/common/sched_rt.c
> index 2e5430f..16f77f9 100644
> --- a/xen/common/sched_rt.c
> +++ b/xen/common/sched_rt.c
> 
> @@ -213,8 +220,14 @@ static inline struct list_head
> *rt_depletedq(const struct scheduler *ops)
>      return &rt_priv(ops)->depletedq;
>  }
>  
> +static inline struct list_head *rt_replq(const struct scheduler
> *ops)
> +{
> +    return &rt_priv(ops)->replq;
> +}
> +
>  /*
> - * Queue helper functions for runq and depletedq
> + * Queue helper functions for runq, depletedq
> + * and replenishment event queue
>
Full stop. :-)

In any case, I'd change this in something like:
"Helper functions for manipulating the runqueue, the depleted queue,
and the replenishment events queue."

> @@ -228,6 +241,18 @@ __q_elem(struct list_head *elem)
>      return list_entry(elem, struct rt_vcpu, q_elem);
>  }
>  
> +static struct rt_vcpu *
> +__replq_elem(struct list_head *elem)
> +{
> +    return list_entry(elem, struct rt_vcpu, replq_elem);
> +}
> +
> +static int
> +__vcpu_on_replq(const struct rt_vcpu *svc)
> +{
> +   return !list_empty(&svc->replq_elem);
> +}
> +
>
Ok, sorry for changing my mind again, but I really can't stand seeing
these underscores. Please, rename these as replq_elem() and
vcpu_on_replq(). There is no sensible reason why we should prefix
these with '__'.

I know, that will create some amount of inconsistency, but:
 - there is inconsistency already (here and in other sched_* files);
 - not introducing more __-prefixed functions is a step in the right 
   direction; we'll convert the ones that are already there over time.

> + * Removing a vcpu from the replenishment queue could
> + * re-program the timer for the next replenishment event
> + * if there is any on the list
> 
> + */
> +static inline void
> +__replq_remove(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct rt_private *prv = rt_priv(ops);
> +    struct list_head *replq = rt_replq(ops);
> +    struct timer* repl_timer = prv->repl_timer;
> +
> +    /*
> +     * Disarm the timer before re-programming it.
> +     * It doesn't matter if the vcpu to be removed
> +     * is on top of the list or not because the timer
> +     * is stopped and needs to be re-programmed anyways
> +     */
> +    stop_timer(repl_timer);
> +
> +    deadline_queue_remove(&svc->replq_elem);
> +
> +    /* re-arm the timer for the next replenishment event */
> +    if( !list_empty(replq) )
> +    {
> +        struct rt_vcpu *svc_next = __replq_elem(replq->next);
> +        set_timer(repl_timer, svc_next->cur_deadline);
> +    }
>
Wait, maybe you misunderstood and/or I did not make myself clear enough
(in which case, sorry). I never meant to say "always stop the timer".
Actually, in one of my last emails I said the opposite, and I think
that would be the best thing to do.

Do you think there is any specific reason why we need to always stop
and restart it? If not, I think we can:
 - have deadline_queue_remove() also return whether the element 
   removed was the first one (i.e., the one the timer was programmed 
   after);
 - if it was, re-program the timer after the new front of the queue;
 - if it wasn't, nothing to do.

Thoughts?

> +/*
> + * An utility function that inserts a vcpu to a
> + * queue based on certain order (EDF). The returned
> + * value is 1 if a vcpu has been inserted to the
> + * front of a list
> + */
> +static int
> +deadline_queue_insert(struct rt_vcpu * (*_get_q_elem)(struct
> list_head *elem),
> +    struct rt_vcpu *svc, struct list_head *elem, struct list_head
> *queue)
> +{
> +    struct list_head *iter;
> +    int pos = 0;
> +
> +    list_for_each(iter, queue)
> +    {
> +        struct rt_vcpu * iter_svc = (*_get_q_elem)(iter);
> +        if ( svc->cur_deadline <= iter_svc->cur_deadline )
> +            break;
> +
> +        pos++;
> +    }
> +
> +    list_add_tail(elem, iter);
> +    return !pos;
>  }
> 
Ok.

> @@ -405,22 +500,38 @@ __runq_insert(const struct scheduler *ops,
> struct rt_vcpu *svc)
>  
>      /* add svc to runq if svc still has budget */
>      if ( svc->cur_budget > 0 )
> -    {
> -        list_for_each(iter, runq)
> -        {
> -            struct rt_vcpu * iter_svc = __q_elem(iter);
> -            if ( svc->cur_deadline <= iter_svc->cur_deadline )
> -                    break;
> -         }
> -        list_add_tail(&svc->q_elem, iter);
> -    }
> +        deadline_queue_insert(&__q_elem, svc, &svc->q_elem, runq);
>      else
>      {
>          list_add(&svc->q_elem, &prv->depletedq);
> +        ASSERT( __vcpu_on_replq(svc) );
>
So, by doing this, you're telling me that, if the vcpu is added to the
depleted queue, there must be a replenishment planned for it (or the
ASSERT() would fail).

But if we are adding it to the runq, there may or may not be a
replenishment planned for it.

Is this what we want? Why must there not already be a replenishment
planned for a vcpu going onto the runq (to happen at its next
deadline)?

>  /*
> + * Insert svc into the replenishment event list
> + * in replenishment time order.
> + * vcpus that need to be replished earlier go first.
> + * The timer may be re-programmed if svc is inserted
> + * at the front of the event list.
> + */
> +static void
> +__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
> +{
> +    struct list_head *replq = rt_replq(ops);
> +    struct rt_private *prv = rt_priv(ops);
> +    struct timer *repl_timer = prv->repl_timer;
> +    int set;
> +
> +    ASSERT( !__vcpu_on_replq(svc) );
> +
> +    set = deadline_queue_insert(&__replq_elem, svc, &svc-
> >replq_elem, replq);
>
> +    if( set )
> +        set_timer(repl_timer,svc->cur_deadline);
> +}
A matter of taste, mostly, but I'd avoid the local variable. (If this
function at some point becomes more complex, then we'll see; for now,
I think it's fine to just use the return value of
deadline_queue_insert() inside the if().)
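
I.e., something of this shape (just a sketch reusing the identifiers
from your patch, untested):

```
static void
__replq_insert(const struct scheduler *ops, struct rt_vcpu *svc)
{
    struct list_head *replq = rt_replq(ops);
    struct timer *repl_timer = rt_priv(ops)->repl_timer;

    ASSERT( !__vcpu_on_replq(svc) );

    if ( deadline_queue_insert(&__replq_elem, svc, &svc->replq_elem, replq) )
        set_timer(repl_timer, svc->cur_deadline);
}
```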

> @@ -868,6 +968,8 @@ rt_schedule(const struct scheduler *ops, s_time_t
> now, bool_t tasklet_work_sched
>          set_bit(__RTDS_delayed_runq_add, &scurr->flags);
>  
>      snext->last_start = now;
> +
>
You don't need to add this blank line...

> +    ret.time =  -1; /* if an idle vcpu is picked */ 
>      if ( !is_idle_vcpu(snext->vcpu) )
>      {
>          if ( snext != scurr )
> @@ -880,9 +982,11 @@ rt_schedule(const struct scheduler *ops,
> s_time_t now, bool_t tasklet_work_sched
>              snext->vcpu->processor = cpu;
>              ret.migrated = 1;
>          }
> +
> +        ret.time = snext->budget; /* invoke the scheduler next time
> */
> +
...nor this one.
>      }

> @@ -924,6 +1028,8 @@ rt_vcpu_sleep(const struct scheduler *ops,
> struct vcpu *vc)
>          __q_remove(svc);
>      else if ( svc->flags & RTDS_delayed_runq_add )
>          clear_bit(__RTDS_delayed_runq_add, &svc->flags);
> +
> +    __replq_remove(ops, svc);
>
What I said in my last email is that you probably can get rid of this
(see below, when commenting on context_saved()).
 
> @@ -1051,6 +1153,22 @@ rt_vcpu_wake(const struct scheduler *ops,
> struct vcpu *vc)
>      else
>          SCHED_STAT_CRANK(vcpu_wake_not_runnable);
>  
> +    /*
> +     * If a deadline passed while svc was asleep/blocked, we need
> new
> +     * scheduling parameters ( a new deadline and full budget), and
> +     * also a new replenishment event
> +     */
> +    if ( now >= svc->cur_deadline)
> +    {   
> +        rt_update_deadline(now, svc);
> +        __replq_remove(ops, svc);
> +    }
> +
> +    if( !__vcpu_on_replq(svc) )
> +    {
> +        __replq_insert(ops, svc);
> +        ASSERT( vcpu_runnable(vc) );
>
Mmm... What's this assert about?

> @@ -1087,10 +1193,6 @@ static void
>  rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
>  {
>      struct rt_vcpu *svc = rt_vcpu(vc);
> -    struct rt_vcpu *snext = NULL;
> -    struct rt_dom *sdom = NULL;
> -    struct rt_private *prv = rt_priv(ops);
> -    cpumask_t *online;
>      spinlock_t *lock = vcpu_schedule_lock_irq(vc);
>  
>      clear_bit(__RTDS_scheduled, &svc->flags);
> @@ -1102,14 +1204,7 @@ rt_context_saved(const struct scheduler *ops,
> struct vcpu *vc)
>           likely(vcpu_runnable(vc)) )
>      {
>          __runq_insert(ops, svc);
> -        __repl_update(ops, NOW());
> -
> -        ASSERT(!list_empty(&prv->sdom));
> -        sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
> -        online = cpupool_domain_cpumask(sdom->dom);
> -        snext = __runq_pick(ops, online); /* pick snext from ALL
> cpus */
> -
> -        runq_tickle(ops, snext);
> +        runq_tickle(ops, svc);
>      }
>
So, here we are.

What I meant was to make this function look more or less like this:

static void
rt_context_saved(const struct scheduler *ops, struct vcpu *vc)
{
    struct rt_vcpu *svc = rt_vcpu(vc);
    spinlock_t *lock = vcpu_schedule_lock_irq(vc);

    clear_bit(__RTDS_scheduled, &svc->flags);
    /* not insert idle vcpu to runq */
    if ( is_idle_vcpu(vc) )
        goto out;

    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
    {
        __runq_insert(ops, svc);
        runq_tickle(ops, svc);
    }
    else
        __replq_remove(ops, svc);
out:
    vcpu_schedule_unlock_irq(lock, vc);
}

And, as said above, if you do this, try also removing the
__replq_remove() call from rt_vcpu_sleep(), this one should cover for
that (and, actually, more than just that!).

After all this, check whether the assert in __replq_insert() still
triggers, and let me know.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

