Re: [Xen-devel] [PATCH] Accurate vcpu weighting for credit scheduler
Hi, George

Sorry for the delay. With this type of change, the CPU% shows the following:

  dom1 26
  dom2 26
  dom3 51
  dom4 96

Thanks
Atsushi SAKAI

"George Dunlap" <George.Dunlap@xxxxxxxxxxxxx> wrote:
> OK, I've grueled through an example by hand and think I see what's going on.
>
> So the idea of the credit scheduler is that we have a certain number
> of "credits" per accounting period, and each of these credits
> represents a certain amount of time. The scheduler gives out credits
> according to weight, so theoretically each accounting period, if all
> vcpus are active, each should consume all of its credits. Based on
> that assumption, if a vcpu has run and accumulated more than one full
> accounting period of credits, it's probably idle and we can leave it
> be.
>
> The problem in this situation isn't so much rounding errors as
> *scheduling granularity*. In the example given:
>
> d1: weight 128
> d2: weight 128
> d3: weight 256
> d4: weight 512
>
> If each domain has 2 vcpus, and there are 2 cores, then the credits
> will be divided thus:
>
> d1: 37 credits / vcpu
> d2: 37 credits / vcpu
> d3: 75 credits / vcpu
> d4: 150 credits / vcpu
>
> But scheduling and accounting only happen every "tick", and every
> "tick" is worth 100 credits. So each vcpu of d{1,2}, instead of
> consuming 37 credits, consumes 100; the same goes for each vcpu of
> d3. At the end of the first accounting period, d{1,2,3} have each
> gotten to run for 100 credits' worth of time, but d4 hasn't gotten
> to run at all.
>
> In short, the fact that we have a 100-credit scheduling granularity
> breaks the assumption that every VM has had a chance to run each
> accounting period when there are really long runqueues.
>
> I can think of a couple of solutions: the simplest one might be to
> sort the runqueue by number of credits -- at least every accounting
> period. In that case, d4 would always get to run every accounting
> period; d{1,2} might not run in a given accounting period, but the
> next time around they would have twice the number of credits, &c.
>
> Others might include extending accounting periods when we have long
> runqueues, or doing the credit limit during accounting only if the
> vcpu is not on the runqueue (Sakai-san's idea) *combined* with a
> check when the vcpu blocks. That would catch vcpus that are only
> moderately active, but just happen to be on the runqueue for several
> accounting periods in a row.
>
> Sakai-san, would you be willing to try to implement a simple
> "runqueue sort" patch, and see if it also solves your scheduling
> issue?
>
> -George
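For readers of the archive: below is a minimal, self-contained sketch of the sort George describes -- ordinary C built around qsort(), not the attached patch and not code from the Xen tree. The "vcpu_slot" structure and "runq_sort_by_credit" function are invented for illustration. The idea is simply that a starved vcpu accumulates unspent credits while it waits, so ordering the runqueue by remaining credits puts the most-starved vcpus (like d4's) at the head:

    /*
     * Hypothetical illustration only -- not the attached patch and not
     * code from the Xen tree.  Sort a runqueue so that the vcpus with
     * the most remaining credits run first.
     */
    #include <stdio.h>
    #include <stdlib.h>

    struct vcpu_slot {
        int domid;    /* owning domain, for display only */
        int credit;   /* credits remaining this accounting period */
    };

    /* Most credits first: a starved vcpu accumulates credits while it
     * waits, so it bubbles to the head of the runqueue. */
    static int cmp_credit_desc(const void *a, const void *b)
    {
        const struct vcpu_slot *va = a, *vb = b;
        return vb->credit - va->credit;
    }

    static void runq_sort_by_credit(struct vcpu_slot *runq, size_t n)
    {
        qsort(runq, n, sizeof(*runq), cmp_credit_desc);
    }

    int main(void)
    {
        /* The example from this thread: 8 vcpus on 2 cores, credits as
         * George computed them (37/37/75/150 per vcpu). */
        struct vcpu_slot runq[] = {
            { 1,  37 }, { 1,  37 },   /* d1, weight 128 */
            { 2,  37 }, { 2,  37 },   /* d2, weight 128 */
            { 3,  75 }, { 3,  75 },   /* d3, weight 256 */
            { 4, 150 }, { 4, 150 },   /* d4, weight 512 */
        };
        size_t n = sizeof(runq) / sizeof(runq[0]);

        runq_sort_by_credit(runq, n);
        for (size_t i = 0; i < n; i++)
            printf("dom%d vcpu: %3d credits\n", runq[i].domid, runq[i].credit);
        return 0;
    }

Run standalone, this prints d4's vcpus at the head of the queue, which is exactly the property George wants at the start of each accounting period.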
> On Wed, Dec 10, 2008 at 2:45 AM, Atsushi SAKAI <sakaia@xxxxxxxxxxxxxx> wrote:
> > Hi, Emmanuel
> >
> > 1) Rounding error for credits
> >
> > This patch is about more than a rounding error, so I do not think
> > we need to consider that effect. If you think we do, would you
> > suggest your patch? It seems that changing CSCHED_TICKS_PER_ACCT
> > is not enough.
> >
> > 2) Effect on I/O-intensive jobs
> >
> > I have not changed the code for BOOST priority; I only changed the
> > "credit reset" condition. It should have no effect on I/O-intensive
> > workloads (though I have not measured this). If needed, I will test
> > it. Which test is best for this change? (A simple I/O test is not
> > enough for this case; I think a complex domain I/O configuration is
> > needed to prove the effect of this patch.)
> >
> > 3) vcpu allocation measurement
> >
> > At first I used
> >   http://weather.ou.edu/~apw/projects/stress/
> >   stress --cpu xx --timeout xx --verbose
> > and then a simpler test (since there are 2 vcpus per domain):
> >   yes > /dev/null &
> >   yes > /dev/null &
> > Now I have tested with the suggested method; the result is:
> >
> >          original   w/ patch
> >   dom1      27         25
> >   dom2      27         25
> >   dom3      53         50
> >   dom4      91         98
> >
> > Thanks
> > Atsushi SAKAI
> >
> > Emmanuel Ackaouy <ackaouy@xxxxxxxxx> wrote:
> >
> >> On Dec 9, 2008, at 2:25, George Dunlap wrote:
> >> > On Tue, Dec 9, 2008 at 7:33 AM, Atsushi SAKAI
> >> > <sakaia@xxxxxxxxxxxxxx> wrote:
> >> >> You mean it should get rid of "credit reset"?
> >> >
> >> > Yes, that's exactly what I was thinking. Removing the check for
> >> > vcpus on the runqueue may actually be functionally equivalent to
> >> > removing the check altogether.
> >>
> >> Essentially, this code is there as a safeguard against rounding
> >> errors and other oddball cases. In theory, a runnable VCPU should
> >> seldom accumulate more than one time slice's worth of credits.
> >>
> >> The problem with your change is that a VCPU that is not a spinner
> >> but instead runs and sleeps may not be removed from the accounting
> >> list when it should be, because it will not always be running when
> >> accounting happens and the check in question is performed.
> >> Potentially this will do very bad things for VCPUs that are I/O
> >> intensive or otherwise yield or sleep for a short time before
> >> consuming a full time slice.
> >>
> >> One thing that may help here is to make the credit calculations
> >> less prone to rounding errors. One thing I had wanted to do while
> >> at XenSource but never got around to was to change the arithmetic
> >> so that instead of 30 credits representing a time slice, we would
> >> make this a much bigger number.
> >>
> >> In this case, for example, you would get credit allocations with
> >> less significant rounding errors if you used 30000 instead of 30
> >> credits per time slice:
> >>
> >> dom1 vcpu0,1 w128 credit 3750
> >> dom2 vcpu0,1 w128 credit 3750
> >> dom3 vcpu0,1 w256 credit 7500
> >> dom4 vcpu0,1 w512 credit 15000
> >>
> >> I suspect this would get rid of a large number of cases such as
> >> the one you are reporting, where a runnable VCPU's credit exceeds
> >> one entire time slice. This type of change would improve accuracy
> >> and not screw up credit computation for I/O-intensive and other
> >> non-spinning domains.
> >>
> >> What do you think?
> >>
> >> Also please confirm that your VCPUs are indeed doing simple
> >> "while(1);" loops.
> >>
> >> Cheers,
> >> Emmanuel.
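To make Emmanuel's arithmetic concrete, here is a small standalone sketch (invented variable names, not Xen code) that computes the same weight-based split with truncating integer division at two scales: a pool of 300 credits per pcpu per accounting period, which reproduces George's 37/75/150 figures, and a 100x-scaled pool, which reproduces Emmanuel's 3750/7500/15000 figures:

    /*
     * Hypothetical illustration only -- not Xen code.  The same
     * weight-based credit split, computed with integer division at a
     * small and a 100x-larger credit scale.
     */
    #include <stdio.h>

    int main(void)
    {
        int weights[] = { 128, 128, 256, 512 };   /* d1..d4 */
        int total_w   = 128 + 128 + 256 + 512;    /* 1024 */
        int nvcpus    = 2;                        /* vcpus per domain */
        int ncpus     = 2;                        /* physical cores */
        int scales[]  = { 300, 30000 };           /* credits/pcpu/acct period */

        for (int s = 0; s < 2; s++) {
            int pool = scales[s] * ncpus;         /* 600 or 60000 */
            printf("pool of %d credits:\n", pool);
            for (int d = 0; d < 4; d++) {
                /* truncating division: 37.5 -> 37 at the small scale,
                 * exactly 3750 at the large one */
                int per_vcpu = pool * weights[d] / total_w / nvcpus;
                printf("  dom%d: %5d credits/vcpu\n", d + 1, per_vcpu);
            }
        }
        return 0;
    }

The truncation from 37.5 to 37 credits at the small scale disappears at the larger one (3750 exactly), which is the accuracy gain Emmanuel is after.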
Attachment: runq_sort_for_accurate_weight.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel