
Re: [Xen-devel] [PATCH 06/24] xen: credit2: implement yield()



On 20/09/16 14:25, George Dunlap wrote:
> On 17/08/16 18:18, Dario Faggioli wrote:
>> When a vcpu explicitly yields, it is usually giving
>> us advice along the lines of "let someone else run
>> and come back to me in a bit."
>>
>> Credit2 isn't, so far, doing anything when a vcpu
>> yields, which means a yield is basically a NOP (well,
>> actually, it's pure overhead, as it causes the scheduler
>> to kick in, but the result is --at least 99% of the time--
>> that the very same vcpu that yielded continues to run).
>>
>> Implement a "preempt bias", to be applied to yielding
>> vcpus. Basically when evaluating what vcpu to run next,
>> if a vcpu that has just yielded is encountered, we give
>> it a credit penalty, and check whether there is anyone
>> else that would better take over the cpu (of course,
>> if there isn't the yielding vcpu will continue).
>>
>> The value of this bias can be configured with a boot
>> time parameter, and the default is set to 1 ms.
>>
>> Also, add a yield performance counter, and fix the
>> style of a couple of comments.
>>
>> Signed-off-by: Dario Faggioli <dario.faggioli@xxxxxxxxxx>
>> ---
>> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
>> Cc: Anshul Makkar <anshul.makkar@xxxxxxxxxx>
>> Cc: Jan Beulich <jbeulich@xxxxxxxx>
>> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> Note that this *only* considers the bias during the very scheduling decision
>> that results from the vcpu calling yield. After that, the __CSFLAG_vcpu_yield
>> flag is reset, and during all future scheduling decisions, the vcpu will
>> compete with the other ones with its own amount of credits.
>>
>> Alternatively, we can actually _subtract_ some credits from a yielding vcpu.
>> That will sort of make the effect of a call to yield persist over time.
>>
>> I'm not sure which path is best. Personally, I like the subtract approach
>> (perhaps with a smaller bias than 1ms), but I think the "one shot" behavior
>> implemented here is a good starting point. It is _something_, which is better
>> than nothing, which is what we have without this patch! :-) It's lightweight
>> (in its impact on the crediting algorithm, I mean), and benchmarks look nice,
>> so I propose we go for this one, and explore the "permanent" --subtraction
>> based-- solution a bit more.
>> ---
>>  docs/misc/xen-command-line.markdown |   10 ++++++
>>  xen/common/sched_credit2.c          |   62 +++++++++++++++++++++++++++++++----
>>  xen/common/schedule.c               |    2 +
>>  xen/include/xen/perfc_defn.h        |    1 +
>>  4 files changed, 68 insertions(+), 7 deletions(-)
>>
>> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
>> index 3a250cb..5f469b1 100644
>> --- a/docs/misc/xen-command-line.markdown
>> +++ b/docs/misc/xen-command-line.markdown
>> @@ -1389,6 +1389,16 @@ Choose the default scheduler.
>>  ### sched\_credit2\_migrate\_resist
>>  > `= <integer>`
>>  
>> +### sched\_credit2\_yield\_bias
>> +> `= <integer>`
>> +
>> +> Default: `1000`
>> +
>> +Set how much a yielding vcpu will be penalized, in order to actually
>> +give some other vcpu a chance to run. This is basically a bias, in
>> +favour of the non-yielding vcpus, expressed in microseconds (default
>> +is 1ms).
>> +
>>  ### sched\_credit\_tslice\_ms
>>  > `= <integer>`
>>  
>> diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
>> index a3d7beb..569174b 100644
>> --- a/xen/common/sched_credit2.c
>> +++ b/xen/common/sched_credit2.c
>> @@ -144,6 +144,9 @@
>>  #define CSCHED2_MIGRATE_RESIST       ((opt_migrate_resist)*MICROSECS(1))
>>  /* How much to "compensate" a vcpu for L2 migration */
>>  #define CSCHED2_MIGRATE_COMPENSATION MICROSECS(50)
>> +/* How big of a bias we should have against a yielding vcpu */
>> +#define CSCHED2_YIELD_BIAS           ((opt_yield_bias)*MICROSECS(1))
>> +#define CSCHED2_YIELD_BIAS_MIN       CSCHED2_MIN_TIMER
>>  /* Reset: Value below which credit will be reset. */
>>  #define CSCHED2_CREDIT_RESET         0
>>  /* Max timer: Maximum time a guest can be run for. */
>> @@ -181,11 +184,20 @@
>>   */
>>  #define __CSFLAG_runq_migrate_request 3
>>  #define CSFLAG_runq_migrate_request (1<<__CSFLAG_runq_migrate_request)
>> -
>> +/*
>> + * CSFLAG_vcpu_yield: this vcpu was running, and has called vcpu_yield(). The
>> + * scheduler is invoked to see if we can give the cpu to someone else, and
>> + * get back to the yielding vcpu in a while.
>> + */
>> +#define __CSFLAG_vcpu_yield 4
>> +#define CSFLAG_vcpu_yield (1<<__CSFLAG_vcpu_yield)
>>  
>>  static unsigned int __read_mostly opt_migrate_resist = 500;
>>  integer_param("sched_credit2_migrate_resist", opt_migrate_resist);
>>  
>> +static unsigned int __read_mostly opt_yield_bias = 1000;
>> +integer_param("sched_credit2_yield_bias", opt_yield_bias);
>> +
>>  /*
>>   * Useful macros
>>   */
>> @@ -1432,6 +1444,14 @@ out:
>>  }
>>  
>>  static void
>> +csched2_vcpu_yield(const struct scheduler *ops, struct vcpu *v)
>> +{
>> +    struct csched2_vcpu * const svc = CSCHED2_VCPU(v);
>> +
>> +    __set_bit(__CSFLAG_vcpu_yield, &svc->flags);
>> +}
>> +
>> +static void
>>  csched2_context_saved(const struct scheduler *ops, struct vcpu *vc)
>>  {
>>      struct csched2_vcpu * const svc = CSCHED2_VCPU(vc);
>> @@ -2247,10 +2267,22 @@ runq_candidate(struct csched2_runqueue_data *rqd,
>>      struct list_head *iter;
>>      struct csched2_vcpu *snext = NULL;
>>      struct csched2_private *prv = CSCHED2_PRIV(per_cpu(scheduler, cpu));
>> +    int yield_bias = 0;
>>  
>>      /* Default to current if runnable, idle otherwise */
>>      if ( vcpu_runnable(scurr->vcpu) )
>> +    {
>> +        /*
>> +         * The way we actually take yields into account is like this:
>> +         * if scurr is yielding, when comparing its credits with other
>> +         * vcpus in the runqueue, act like those other vcpus had yield_bias
>> +         * more credits.
>> +         */
>> +        if ( unlikely(scurr->flags & CSFLAG_vcpu_yield) )
>> +            yield_bias = CSCHED2_YIELD_BIAS;
>> +
>>          snext = scurr;
>> +    }
>>      else
>>          snext = CSCHED2_VCPU(idle_vcpu[cpu]);
>>  
>> @@ -2268,6 +2300,7 @@ runq_candidate(struct csched2_runqueue_data *rqd,
>>      list_for_each( iter, &rqd->runq )
>>      {
>>          struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
>> +        int svc_credit = svc->credit + yield_bias;
>>  
>>          /* Only consider vcpus that are allowed to run on this processor. */
>>          if ( !cpumask_test_cpu(cpu, svc->vcpu->cpu_hard_affinity) )
>> @@ -2288,19 +2321,23 @@ runq_candidate(struct csched2_runqueue_data *rqd,
>>              continue;
>>          }
>>  
>> -        /* If this is on a different processor, don't pull it unless
>> -         * its credit is at least CSCHED2_MIGRATE_RESIST higher. */
>> +        /*
>> +         * If this is on a different processor, don't pull it unless
>> +         * its credit is at least CSCHED2_MIGRATE_RESIST higher.
>> +         */
>>          if ( svc->vcpu->processor != cpu
>> -             && snext->credit + CSCHED2_MIGRATE_RESIST > svc->credit )
>> +             && snext->credit + CSCHED2_MIGRATE_RESIST > svc_credit )
>>          {
>>              (*pos)++;
>>              SCHED_STAT_CRANK(migrate_resisted);
>>              continue;
>>          }
>>  
>> -        /* If the next one on the list has more credit than current
>> -         * (or idle, if current is not runnable), choose it. */
>> -        if ( svc->credit > snext->credit )
>> +        /*
>> +         * If the next one on the list has more credit than current
>> +         * (or idle, if current is not runnable), choose it.
>> +         */
>> +        if ( svc_credit > snext->credit )
>>              snext = svc;
> 
> Hmm, if we change snext, shouldn't we also zero out the yield bias?
> Otherwise vcpus competing with snext (which will at this point have had
> the YIELD flag cleared) will be given the yield bonus as well, which is
> not what we want.  In fact, I think in this case we'll always choose the
> *last* vcpu on the list unless there's one where the gap between N and
> N+1 is greater than YIELD_BIAS, won't it?

Oops -- just noticed the next line:

        /* In any case, if we got this far, break. */
        break;

Nevermind. :-)
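For the record, the comparison this thread is about boils down to something
like the following standalone sketch (the names and the vcpu model are purely
illustrative, not the actual Credit2 code; hard affinity, migrate resistance
and the real runqueue bookkeeping are left out):

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a vcpu: only the credit matters here. */
struct vcpu_model {
    int credit;
};

/* Default bias, matching sched_credit2_yield_bias=1000 (in microseconds). */
#define YIELD_BIAS 1000

/*
 * Simplified runq_candidate(): the runq array is sorted by credit,
 * highest first. If the currently running vcpu has yielded, every
 * queued vcpu gets a YIELD_BIAS head start when compared against it.
 */
static const struct vcpu_model *
pick_next(const struct vcpu_model *curr, int curr_yielded,
          const struct vcpu_model *runq, size_t n)
{
    int bias = curr_yielded ? YIELD_BIAS : 0;
    const struct vcpu_model *snext = curr;
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        /* A queued vcpu takes over if its (biased) credit wins... */
        if ( runq[i].credit + bias > snext->credit )
            snext = &runq[i];
        /* ...but in any case, if we got this far, break: the runq is
         * sorted, so no later entry can do better. */
        break;
    }
    return snext;
}
```

Since the break is unconditional, only the head of the (credit-sorted)
runqueue is ever compared against the biased credit, which is why the
cascade described above can't happen.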

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 

