Re: [Xen-devel] [PATCH V2 1/1] Improved RTDS scheduler



On Thu, 2015-12-31 at 05:20 -0500, Tianyang Chen wrote:
> 
> @@ -147,6 +148,16 @@ static unsigned int nr_rt_ops;
>  * Global lock is referenced by schedule_data.schedule_lock from all
>  * physical cpus. It can be grabbed via vcpu_schedule_lock_irq()
>  */
> +
> +/* dedicated timer for replenishment */
> +static struct timer repl_timer;
> +
So, there's always only one timer... Even if we have multiple cpupools
with RTDS as their scheduler, they share the replenishment timer? I
think it makes more sense to make this per-scheduler.
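
Something like this, I mean (just a sketch of what I have in mind; the
exact field placement and the rt_init() details are made up for
illustration, and repl_handler()'s prototype is assumed to be around):

struct rt_private {
    spinlock_t lock;            /* existing fields ... */
    struct list_head sdom;
    struct list_head runq;
    struct list_head depletedq;
    cpumask_t tickled;
    struct timer repl_timer;    /* NEW: one replenishment timer
                                 * per scheduler instance */
};

static int
rt_init(struct scheduler *ops)
{
    struct rt_private *prv = xzalloc(struct rt_private);

    if ( prv == NULL )
        return -ENOMEM;

    /* Existing initialization of prv's fields goes here. */

    /* Bind the handler to this scheduler instance via the data
     * pointer, so repl_handler() knows which instance fired. */
    init_timer(&prv->repl_timer, repl_handler, (void *)ops, 0);

    ops->sched_data = prv;
    return 0;
}

That way, each cpupool using RTDS gets its own timer, and
repl_handler() can fetch everything it needs from ops.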

> +/* controls when to first start the timer*/
> +static int timer_started;
> +
I don't like this, and I don't think we need it. In fact, you removed
it yourself in v3, AFAICT.

> @@ -635,6 +652,13 @@ rt_vcpu_insert(const struct scheduler *ops,
> struct vcpu *vc)
> 
>      /* add rt_vcpu svc to scheduler-specific vcpu list of the dom */
>      list_add_tail(&svc->sdom_elem, &svc->sdom->vcpu);
> +
> +    if(!timer_started)
> +    {
> +        /* the first vcpu starts the timer for the first time*/
> +        timer_started = 1;
> +        set_timer(&repl_timer,svc->cur_deadline);
> +    }
>  }
> 
This also seems to be gone in v3, which is good. In fact, it used
timer_started, which I already said I didn't like.

About the actual arming of the timer (no matter whether for the first
time or not): here, you were doing it in _vcpu_insert() and not in
_vcpu_wake(); in v3 you're doing it in _vcpu_wake() and not in
_runq_insert()... Which one is the proper way?
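
FWIW, if wakeup turns out to be the right place, I'd expect the arming
logic to look more or less like this (a sketch only; the repl_expires
bookkeeping variable is made up, and should live wherever the timer
itself ends up living):

/* Next to repl_timer (file scope here; per-scheduler if the timer
 * becomes per-scheduler): */
static s_time_t repl_expires;   /* 0 == timer not armed yet */

/* In rt_vcpu_wake(), after svc->cur_deadline has been refreshed: */
if ( repl_expires == 0 || svc->cur_deadline < repl_expires )
{
    /* svc needs a replenishment earlier than whatever the timer
     * is currently programmed for: (re)program it. */
    repl_expires = svc->cur_deadline;
    set_timer(&repl_timer, repl_expires);
}

No timer_started flag needed: the comparison covers both the first
arming and all the subsequent ones.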

> @@ -792,44 +816,6 @@ __runq_pick(const struct scheduler *ops, const
> cpumask_t *mask)
>  }
> 
>  /*
> - * Update vcpu's budget and
> - * sort runq by insert the modifed vcpu back to runq
> - * lock is grabbed before calling this function
> - */
> -static void
> -__repl_update(const struct scheduler *ops, s_time_t now)
> -{
> 
Please allow me to say that seeing this function go away fills my
heart with pure joy!! :-D

> @@ -889,7 +874,7 @@ rt_schedule(const struct scheduler *ops, s_time_t
> now, bool_t tasklet_work_sched
>          }
>      }
> 
> -    ret.time = MIN(snext->budget, MAX_SCHEDULE); /* sched quantum */
> +    ret.time = snext->budget; /* invoke the scheduler next time */
>      ret.task = snext->vcpu;
> 
This is ok as it is done in v3 (i.e., snext->budget if !idle, -1 if
idle).
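
I.e., spelling it out:

/* v3: no scheduling timer if we are running idle; otherwise, invoke
 * the scheduler again when snext's budget is due to run out. */
ret.time = is_idle_vcpu(snext->vcpu) ? -1 : snext->budget;
ret.task = snext->vcpu;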

> @@ -1074,14 +1055,7 @@ rt_vcpu_wake(const struct scheduler *ops,
> struct vcpu *vc)
>      /* insert svc to runq/depletedq because svc is not in queue now
> */
>      __runq_insert(ops, svc);
> 
> -    __repl_update(ops, now);
> -
> -    ASSERT(!list_empty(&prv->sdom));
> -    sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
> -    online = cpupool_scheduler_cpumask(sdom->dom->cpupool);
> -    snext = __runq_pick(ops, online); /* pick snext from ALL valid
> cpus */
> -
> -    runq_tickle(ops, snext);
> +    runq_tickle(ops, svc);
> 
And this is another thing I especially like about this patch: it makes
the wakeup path a lot simpler, and a lot more similar to how it looks
in the other schedulers.

Good job with this. :-)

> @@ -1108,15 +1078,8 @@ rt_context_saved(const struct scheduler *ops,
> struct vcpu *vc)
>      if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
>           likely(vcpu_runnable(vc)) )
>      {
> +        /* only insert the pre-empted vcpu back*/
>          __runq_insert(ops, svc);
> -        __repl_update(ops, NOW());
> -
> -        ASSERT(!list_empty(&prv->sdom));
> -        sdom = list_entry(prv->sdom.next, struct rt_dom, sdom_elem);
> -        online = cpupool_scheduler_cpumask(sdom->dom->cpupool);
> -        snext = __runq_pick(ops, online); /* pick snext from ALL
> cpus */
> -
> -        runq_tickle(ops, snext);
>      }
Mmm... I'll think about this more and let you know... But off the
top of my head, I think the tickling has to stay? You preempted a vcpu
from the pcpu where it was running; maybe some other pcpu is either
idle or running a vcpu with a later deadline, and should come and pick
this one up?
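
I mean something like this (sketch), mirroring what the patch itself
now does in rt_vcpu_wake():

    if ( test_and_clear_bit(__RTDS_delayed_runq_add, &svc->flags) &&
         likely(vcpu_runnable(vc)) )
    {
        /* only insert the pre-empted vcpu back */
        __runq_insert(ops, svc);
        /* Tickle with the vcpu we've just queued, so that an idle
         * pcpu, or one running a vcpu with a later deadline, gets
         * a chance to come and pick it up. */
        runq_tickle(ops, svc);
    }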

> @@ -1167,6 +1130,74 @@ rt_dom_cntl(
>      return rc;
>  }
> 
> +static void repl_handler(void *data){
> +    unsigned long flags;
> +    s_time_t now = NOW();
> +    s_time_t min_repl = LONG_MAX; /* max time used in comparison*/
> +    struct scheduler *ops = data;
> +    struct rt_private *prv = rt_priv(ops);
> +    struct list_head *runq = rt_runq(ops);
> +    struct list_head *depletedq = rt_depletedq(ops);
> +    struct list_head *iter;
> +    struct list_head *tmp;
> +    struct rt_vcpu *svc = NULL;
> +
> +    spin_lock_irqsave(&prv->lock,flags);
> +
> +    stop_timer(&repl_timer);
> +
> +    list_for_each_safe(iter, tmp, runq)
> +    {
> 
So, I'm a bit lost here. Why does the _replenishment_ timer's handler
scan the runqueue (and, in v3, the running-queue as well)?

I'd expect the replenishment timer to do, ehm, replenishments... And I
wouldn't expect a vcpu that is either in the runqueue or running to
need a replenishment (and, in case one does, I don't think we should
take care of it here).
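
To be clearer, what I'd have expected to find here is something along
these lines (just a sketch, reusing this patch's own helpers; the
re-arming of the timer is left out, more on that below):

static void repl_handler(void *data)
{
    struct scheduler *ops = data;
    struct rt_private *prv = rt_priv(ops);
    struct list_head *depletedq = rt_depletedq(ops);
    struct list_head *iter, *tmp;
    unsigned long flags;
    s_time_t now = NOW();

    spin_lock_irqsave(&prv->lock, flags);

    /* Replenishments concern only vcpus parked on the depletedq:
     * refill the ones whose new period has begun, move them back
     * to the runq, and tickle so they get a chance to run. */
    list_for_each_safe ( iter, tmp, depletedq )
    {
        struct rt_vcpu *svc = __q_elem(iter);

        if ( now < svc->cur_deadline )
            continue;   /* 'break', if depletedq were kept sorted */

        rt_update_deadline(now, svc);
        __q_remove(svc);
        __runq_insert(ops, svc);
        runq_tickle(ops, svc);
    }

    spin_unlock_irqrestore(&prv->lock, flags);
}

No runqueue scanning, no re-sorting: just replenishments.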

> +        svc = __q_elem(iter);
> +        if ( now < svc->cur_deadline )
> +            break;
> +        rt_update_deadline(now, svc);
> +        /* scan the runq to find the min release time
> +         * this happens when vcpus on runq miss deadline
> +         */
> 
This is exactly my point. It looks to me that you're trying to catch
ready or running vcpus missing their deadlines here, in the
replenishment timer handler. I don't think this is appropriate... It
makes the logic of the handler a lot more complicated than it should be.

Oh, and one thing: the use of the term "release time" is IMO a bit
misleading. Release of what? Typically, the release time of an RT task
(or job) is when the task (or job) is declared ready to run... But I
don't think it's being used like that here.

I propose to just get rid of it.

> +        if( min_repl> svc->cur_deadline )
> +        {
> +            min_repl = svc->cur_deadline;
> +        }
> +        /* reinsert the vcpu if its deadline is updated */
> +        __q_remove(svc);
> +        __runq_insert(ops, svc);
> 
One more proof of what I was trying to say. Is it really this handler's
job to --basically-- re-sort the runqueue? I don't think so.

What is the specific situation that you are trying to handle like this?

In fact, this is also why I'm not convinced of the fact that we need
the additional queue for running vcpus. Later in the thread, Meng
says:

 "the current running VCPUs on cores also need to replenish their
 budgets at the beginning of their next periods."

And he makes the following example:

 "[a backlogged] VCPU has its period equal to its budget. Suppose this
 VCPU is the only VCPU on a 4 core machine. This VCPU should keep
 running on one core and never be put back to runq. In the current
 code, this VCPU won't have its budget replenished."

But I don't think I understand. When a vcpu runs out of budget, either:
 a. it needs an immediate replenishment
 b. it needs to go to depletedq, and a replenishment event for it
    programmed (which may or may not require re-arming the
    replenishment timer)

Meng's example falls in a., AFAICT, and we can just deal with that when
we handle the 'budget exhausted' event (in rt_schedule(), in this case,
I think).
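
Roughly, still in sketch form (names from this patch, assuming
burn_budget() has just run on scurr inside rt_schedule()):

if ( scurr->budget <= 0 )
{
    if ( now >= scurr->cur_deadline )
    {
        /* Case a.: the new period has started already, so we can
         * replenish on the spot and let scurr compete as usual. */
        rt_update_deadline(now, scurr);
    }
    else
    {
        /* Case b.: __runq_insert() will park scurr on the
         * depletedq; all that's left is making sure a replenishment
         * event fires at scurr->cur_deadline (re-arming the timer
         * if that is earlier than its current expiry). */
    }
}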

The case you refer to in the comment above ("when vcpus on runq miss
deadline") can fall either in a. or in b., but in both cases it seems
to me that you can handle it when it happens, instead of inside this
timer handling routine.

> +
> +    /* if timer was triggered but none of the vcpus
> +     * need to be refilled, set the timer to be the
> +     * default period + now
> +     */
> +    if(min_repl == LONG_MAX)
> +    {
> +        set_timer(&repl_timer, now + RTDS_DEFAULT_PERIOD);
> 
I agree with Meng's point in this thread: this should not be necessary.
If it is, it's most likely due to a bug, or to something else going
wrong.

Let's figure out what it is, and fix it properly. (I see that in v3
this logic is gone, so hopefully you found and fixed the issue
already.)
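
IOW, once the handler has done its replenishments, the timer should
either stay stopped (no one waiting for a replenishment means there is
nothing to wake up for) or be programmed for the earliest actual
replenishment instant. E.g. (assuming a depletedq kept sorted by
cur_deadline; with an unsorted one, a min-scan is needed instead):

/* At the end of repl_handler(): re-arm only if some vcpu is still
 * waiting for a replenishment. */
if ( !list_empty(depletedq) )
    set_timer(&repl_timer, __q_elem(depletedq->next)->cur_deadline);

No made-up RTDS_DEFAULT_PERIOD involved.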

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
