Re: [Xen-devel] [PATCH 1/4] xen: credit2: implement utilization cap
On 12/06/2017 14:19, Dario Faggioli wrote:

Hey,

"domain have credit"? The statement is: if the "vcpus of domain have credit".

Budget is burned by the domain's vCPUs, in a similar way to how credits are. When a domain runs out of budget, its vCPUs can't run any longer.

If the vcpus of a domain have credit and the budget has run out, will the vcpus not be scheduled?

Is this a question? Assuming it is, what do you mean with "domain have credit"? Domains always have credits, and they never run out of them. There's no such thing as a domain not being runnable because it does not have credits. About budget, a domain with <= 0 budget means all its vcpus are not runnable, and hence won't be scheduled, no matter their credits.

You answered the question here. What I want to ask is: if the budget of the domain is replenished, but credit for the vcpus of that domain is not available, then what happens? I believe the vcpus won't be scheduled (even if they have budget_quota) till they get their credit replenished.

@@ -92,6 +92,82 @@
  */
 /*
+ * Utilization cap:
+ *
+ * Setting an pCPU utilization cap for a domain means the following:
+ *
+ * - a domain can have a cap, expressed in terms of % of physical
+ * For implementing this, we use the following approach:
+ *
+ * - each domain is given a 'budget', an each domain has a timer, which
+ *   replenishes the domain's budget periodically. The budget is the amount
+ *   of time the vCPUs of the domain can use every 'period';
+ *
+ * - the period is CSCHED2_BDGT_REPL_PERIOD, and is the same for all domains
+ *   (but each domain has its own timer; so the all are periodic by the same
+ *   period, but replenishment of the budgets of the various domains, at
+ *   periods boundaries, are not synchronous);
+ *
+ * - when vCPUs run, they consume budget. When they don't run, they don't
+ *   consume budget. If there is no budget left for the domain, no vCPU of
+ *   that domain can run.
If a vCPU tries to run and finds that there is no
+ *   budget, it blocks.
+ *   Budget never expires, so at whatever time a vCPU wants to run, it can
+ *   check the domain's budget, and if there is some, it can use it.
+ *
+ * - budget is replenished to the top of the capacity for the domain once
+ *   per period. Even if there was some leftover budget from previous period,
+ *   though, the budget after a replenishment will always be at most equal
+ *   to the total capacify of the domain ('tot_budget');
+ *

Budget is replenished but credits are not available?

Still not getting this. Why can't you be sure? The scheduler knows the domain budget and the number of vcpus per domain, so we can calculate the budget_quota and translate it into a cpu slot duration. Similarly, the value of the rate limit is also known. We can compare the two and give a warning to the user if the budget_quota is less than the rate limit.

Budget is finished but the vcpu has not reached the rate limit boundary?

Budget takes precedence over ratelimiting. This is important to keep cap working "regularly", rather than in some kind of permanent "trying-to-keep-up-with-overruns-in-previous-periods" state.

And, ideally, a vcpu cap and ratelimiting should be set in such a way that they don't step on each other's toes (or do that only rarely). I can see about trying to print a warning when I detect potential tricky values (but it's not easy, considering budget is per-domain, so I can't be sure about how much each vcpu will actually get, and whether or not that will reveal to be significantly less than ratelimiting the most of the times).

This is very important for the user to know. If wrongly chosen, it can adversely affect the system's performance with frequent context switches (the problem we are aware of).

+ * - when a budget replenishment occurs, if there are vCPUs that had been
+ *   blocked because of lack of budget, they'll be unblocked, and they will
+ *   (potentially) be able to run again.
+ *
+ * Finally, some even more implementation related detail:
+ *
+ * - budget is stored in a domain-wide pool. vCPUs of the domain that want
+ *   to run go to such pool, and grub some. When they do so, the amount
+ *   they grabbed is _immediately_ removed from the pool. This happens in
+ *   vcpu_try_to_get_budget();
+ *
+ * - when vCPUs stop running, if they've not consumed all the budget they
+ *   took, the leftover is put back in the pool. This happens in
+ *   vcpu_give_budget_back();

200% budget, 4 vcpus to run on 4 pcpus, each allowed only 50% of budget. This is a static allocation.

Err... again, are you telling or asking?

Giving an example to prove it's a static allocation.

For example, with 2 vcpus running on 2 pcpus at 20% budgeted time, if vcpu3 wants to execute some cpu intensive task, then it won't be allowed to use more than 50% of the pcpus.

With what parameters? You mean with these ones you cite here (i.e., a 200% budget)? If the VM has 200%, and vcpu1 and vcpu2 consume 20%+20%=40%, there's 160% free for vcpu3 and vcpu4.

I checked the implementation below and I believe we can allow for this type of dynamic budget_quota allocation per vcpu. Not for the initial version, but certainly we can consider it for future versions.

But... it's already totally dynamic.

csched2_dom_cntl()
{
    svc->budget_quota = max(sdom->tot_budget / sdom->nr_vcpus,
                            CSCHED2_MIN_TIMER);
}

If sdom->tot_budget = 200 and nr_vcpus is 4, then each vcpu gets 50%. How is this dynamic allocation? We are not considering the utilization of the other vcpus of the domain before allocating budget_quota for some vcpu. Let me know if my understanding is wrong.

Just shared a thought, as I experienced the confusion while I was reading the code for the first time. If you don't agree, it's fine.

@@ -408,6 +505,10 @@ struct csched2_vcpu {
     unsigned int residual;
     int credit;
+
+    s_time_t budget;

It's confusing; please can we have different member names for budget in the domain and vcpu structures?

Mmm.. I don't think it is.
It's "how much budget this _thing_ have", where "thing" can be the domain or a vcpu, and you can tell that by looking at the containing structure. Most of the time, that's rather explicit, the former being sdom->budget, the latter being svc->budget.

What different names did you have in mind? The only alternative that I can come up with would be something like:

struct csched2_dom {
    ...
    dom_budget;
    ...
};
struct csched2_vcpu {
    ...
    vcpu_budget;
    ...
};

Which I don't like (because of the repetition).

@@ -1354,7 +1469,16 @@ static void reset_credit(const struct scheduler *ops, int cpu, s_time_t now,
          * that the credit it has spent so far get accounted.
          */
         if ( svc->vcpu == curr_on_cpu(svc_cpu) )
+        {
             burn_credits(rqd, svc, now);
+            /*
+             * And, similarly, in case it has run out of budget, as a
+             * consequence of this round of accounting, we also must inform
+             * its pCPU that it's time to park it, and pick up someone else.
+             */
+            if ( unlikely(svc->budget <= 0) )

Use of unlikely() here is not saving many cpu cycles.

Well, considering that not all domains will have a cap, and that we don't expect, even for the domains with caps, all their vcpus to exhaust their budget at every reset event, I think annotating this as an unlikely event makes sense.

From what I understand, I considered it to be a very likely event.

It's not a super big deal, though, and I can get rid of it, if people don't like/are not convinced about it. :-)

Yes, it's fine, we can leave it for now.

@@ -1410,27 +1534,35 @@ void burn_credits(struct
+    sdom->budget += svc->budget;
+
+    if ( sdom->budget > 0 )
+    {
+        svc->budget = sdom->budget;

Why are you assigning the remaining sdom->budget to only this svc? svc should be assigned a proportionate budget. Each vcpu is assigned a percentage of the domain budget based on the cap and the number of vcpus. There is a difference between the code that's here and the code in branch git://xenbits.xen.org/people/dariof/xen.git (fetch) rel/sched/credti2-caps branch.
Logic in the branch code looks fine, where you are taking svc->budget_quota into consideration.

Yeah... maybe look at patch 3/4. :-P

Yeah, got it in the third patch. :)

Yes, got your point, but then the call to vcpu_try_to_get_budget should move to the code block in runq_candidate that returns scurr, otherwise the condition looks incomplete and makes the logic ambiguous.

In runq_candidate we have this code:

    /*
     * Return the current vcpu if it has executed for less than ratelimit.
     * Adjuststment for the selected vcpu's credit and decision
     * for how long it will run will be taken in csched2_runtime.
     *
     * Note that, if scurr is yielding, we don't let rate limiting kick in.
     * In fact, it may be the case that scurr is about to spin, and there's
     * no point forcing it to do so until rate limiting expires.
     */
    if ( !yield && prv->ratelimit_us && !is_idle_vcpu(scurr->vcpu) &&
         vcpu_runnable(scurr->vcpu) &&
         (now - scurr->vcpu->runstate.state_entry_time) <
          MICROSECS(prv->ratelimit_us) )

In this code block we return scurr. Here there is no check for the vcpu budget. Even if the scurr vcpu has executed for less than the rate limit and scurr is not yielding, we need to check its budget before returning scurr.

But we check vcpu_runnable(scurr). And we've already called, in csched2_schedule(), vcpu_try_to_get_budget(scurr). And if scurr could not get any budget, we called park_vcpu(scurr), which sets scurr up in such a way that vcpu_runnable(scurr) is false.

We call runq_candidate to find the next runnable candidate. If we want to return scurr as the current runnable candidate, then it should have gone through all the checks, including budget_quota, and all these checks should be in one place.

Thanks and Regards,
Dario
Anshul

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
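
For readers following the "static vs. dynamic" exchange above, here is a minimal, self-contained sketch of the pool mechanics as the patch comment and the replies describe them. It is not the Xen code: `struct dom`, `struct vcpu`, `MIN_QUOTA`, the microsecond units and all function bodies are assumptions made for illustration. It only models a per-vCPU quota of max(tot_budget / nr_vcpus, minimum), grabbing from the domain-wide pool, returning leftovers, and a replenishment that tops the pool up to at most tot_budget.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t s_time_t;   /* stand-in for Xen's s_time_t */

#define MIN_QUOTA 500       /* hypothetical stand-in for CSCHED2_MIN_TIMER */

struct dom  { s_time_t tot_budget; s_time_t budget; unsigned int nr_vcpus; };
struct vcpu { struct dom *sdom; s_time_t budget; s_time_t budget_quota; };

/* Per-vCPU quota, modelled on the expression quoted in the thread. */
static s_time_t quota_for(const struct dom *d)
{
    assert(d->nr_vcpus > 0);
    s_time_t q = d->tot_budget / d->nr_vcpus;
    return q > MIN_QUOTA ? q : MIN_QUOTA;
}

/* Grab up to one quota from the domain pool; returns 1 if the vCPU can run. */
static int try_to_get_budget(struct vcpu *v)
{
    struct dom *d = v->sdom;

    if ( v->budget > 0 )
        return 1;
    d->budget += v->budget;          /* settle any overrun (v->budget <= 0) */
    v->budget = 0;
    if ( d->budget > 0 )
    {
        s_time_t take = d->budget < v->budget_quota ? d->budget
                                                    : v->budget_quota;
        v->budget = take;
        d->budget -= take;           /* removed from the pool immediately */
    }
    return v->budget > 0;
}

/* Running for 't' time units burns that much of the vCPU's budget. */
static void run_for(struct vcpu *v, s_time_t t)
{
    v->budget -= t;
}

/* Put any unused budget back in the pool when the vCPU stops running. */
static void give_budget_back(struct vcpu *v)
{
    if ( v->budget > 0 )
    {
        v->sdom->budget += v->budget;
        v->budget = 0;
    }
}

/* Periodic replenishment: the pool never exceeds tot_budget afterwards. */
static void replenish(struct dom *d)
{
    d->budget += d->tot_budget;
    if ( d->budget > d->tot_budget )
        d->budget = d->tot_budget;
}

/* The "200%, 4 vCPUs" example: two vCPUs each use 20% and return the rest. */
s_time_t demo_leftover(void)
{
    struct dom d = { .tot_budget = 200000, .budget = 200000, .nr_vcpus = 4 };
    struct vcpu v1 = { .sdom = &d, .budget = 0, .budget_quota = quota_for(&d) };
    struct vcpu v2 = v1;

    if ( !try_to_get_budget(&v1) || !try_to_get_budget(&v2) )
        return -1;
    run_for(&v1, 20000);
    run_for(&v2, 20000);
    give_budget_back(&v1);
    give_budget_back(&v2);

    s_time_t leftover = d.budget;    /* what other vCPUs can still grab */
    replenish(&d);
    assert(d.budget == d.tot_budget);
    return leftover;
}
```

With a period of 100000 time units standing for 100%, `demo_leftover()` returns 160000, which matches the "160% free for vcpu3 and vcpu4" figure given in the reply. In this model the quota only bounds how much a vCPU takes per grab; a busy vCPU can come back to the pool repeatedly, which is the sense in which the allocation is described as dynamic.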