
Re: [Xen-devel] [RFC PATCH v2 1/2] xen: credit2: rb-tree for runqueues



Hi Dario,
Thanks for your comments.

On Fri, Jan 18, 2019 at 8:38 PM Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
>
> On Sun, 2018-12-23 at 19:51 +0530, Praveen Kumar wrote:
> > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> > index 623a325ceb..2463a25f87 100644
> > --- a/xen/common/sched_credit2.c
> > +++ b/xen/common/sched_credit2.c
> > @@ -471,7 +472,7 @@ custom_param("credit2_runqueue",
> > parse_credit2_runqueue);
> >  struct csched2_runqueue_data {
> >      spinlock_t lock;           /* Lock for this runqueue                     */
> >
> > -    struct list_head runq;     /* Ordered list of runnable vms               */
> > +    struct rb_root runq;       /* Runqueue is an rbtree                      */
> >
> I wouldn't change the comment. It's useful to know that the idea is to
> use this field to keep a list of runnable vcpus, and that we want it to
> be ordered, which is what the comment currently says.
>
> It's pointless to state that we're using an rb-tree, because one can
> tell that just by looking at the data type.
>
> Actually, the comment says "Ordered list of runnable vms", while I
> think it would be better if it said "Ordered list of runnable vcpus".
>
> Changing "vms" to "vcpus" does not belong in this patch, strictly
> speaking, but if you want to do that, I could live with that. If you
> don't want to do it, then don't, and just leave the comment alone.
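
Ok, I will keep the comment then. With just 'vms' changed to 'vcpus', as
you suggest, the field would read something like:

      struct rb_root runq;       /* Ordered list of runnable vcpus             */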
>
> > @@ -604,6 +605,28 @@ static inline bool has_cap(const struct csched2_vcpu *svc)
> >      return svc->budget != STIME_MAX;
> >  }
> >
> > +static void rb_runq_insert(struct rb_root *root,
> > +                           struct csched2_vcpu *svc,
> > +                           int *pos)
> > +{
> > +    struct csched2_vcpu *entry = NULL;
> >
> I'd call this (something like) 'parent_svc'.
>
> It makes it easier to understand what's actually happening in the loop
> below.
>
> > +    struct rb_node **node = &root->rb_node;
> > +    struct rb_node *parent = NULL;
> > +
> > +    while (*node) {
> > +        parent = *node;
> > +        entry = rb_entry(parent, struct csched2_vcpu, runq_elem);
> > +        if ( svc->credit < entry->credit )
> > +            node = &parent->rb_left;
> > +        else
> > +            node = &parent->rb_right;
> > +
> > +        (*pos)++;
> > +    }
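
Ok, 'parent_svc' is clearer, I will rename it. For reference, the helper
would then look something like the sketch below. Please note the
rb_link_node()/rb_insert_color() tail is just the standard way to finish
an rbtree insertion and is not part of the hunk quoted above, so treat
this as a sketch rather than the exact patch code:

  static void rb_runq_insert(struct rb_root *root,
                             struct csched2_vcpu *svc,
                             int *pos)
  {
      struct csched2_vcpu *parent_svc;
      struct rb_node **node = &root->rb_node;
      struct rb_node *parent = NULL;

      while ( *node )
      {
          parent = *node;
          parent_svc = rb_entry(parent, struct csched2_vcpu, runq_elem);
          /* Lower credit goes to the left, higher (or equal) to the right. */
          if ( svc->credit < parent_svc->credit )
              node = &parent->rb_left;
          else
              node = &parent->rb_right;

          (*pos)++;
      }

      /* Standard rbtree insertion tail: link the new node and rebalance. */
      rb_link_node(&svc->runq_elem, parent, node);
      rb_insert_color(&svc->runq_elem, root);
  }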
>
> > @@ -1789,6 +1803,7 @@ static void park_vcpu(struct csched2_vcpu *svc)
> >       * In both cases, we also add it to the list of parked vCPUs of the domain.
> >       */
> >      __set_bit(_VPF_parked, &v->pause_flags);
> > +
> >      if ( vcpu_on_runq(svc) )
> >      {
> >          runq_remove(svc);
> >
> Unrelated change. Don't do it. :-)
>
> > @@ -2087,6 +2102,8 @@ csched2_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
> >      }
> >      else if ( vcpu_on_runq(svc) )
> >      {
> > +        printk(XENLOG_WARNING "%s : %d : vcpu on runq rb !\n", __func__, __LINE__);
> > +
> >
> So, this, and all the various other lines similar to this are/have been
> useful for debugging, I guess?
>
> That is ok, but I don't think they should land upstream. If you think
> they (or maybe some of them) should, then make it/them proper debugging
> output.
>
> And, probably, do that in its own patch, as it would not be related to
> what the data structure used for the runqueues is.
>
> But again, IMO, you can get rid of all of them.
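
Ok, will drop them from the next version. If some of them turn out to be
worth keeping, I guess the proper form would be something like dprintk(),
which already prefixes file/line and, as far as I understand, compiles
away in non-debug builds, e.g.:

  dprintk(XENLOG_DEBUG, "%pv is still on the runqueue\n", vc);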
>
> >          ASSERT(svc->rqd == c2rqd(ops, vc->processor));
> >          update_load(ops, svc->rqd, svc, -1, NOW());
> >          runq_remove(svc);
>
> > @@ -2764,8 +2783,10 @@ csched2_vcpu_migrate(
> >      if ( unlikely(!cpumask_test_cpu(new_cpu, cpupool_domain_cpumask(d))) )
> >      {
> >          ASSERT(system_state == SYS_STATE_suspend);
> > +
> >          if ( vcpu_on_runq(svc) )
> >          {
> >
> Stray new-line again.
>
> > @@ -3206,17 +3227,18 @@ csched2_runtime(const struct scheduler *ops, int cpu,
> >       * 2) If there's someone waiting whose credit is positive,
> >       *    run until your credit ~= his.
> >       */
> > -    if ( ! list_empty(runq) )
> > +    if ( ! RB_EMPTY_ROOT(runq) )
> >      {
> > -        struct csched2_vcpu *swait = runq_elem(runq->next);
> > +        // Find the left most element, which is the most probable candidate
> > +        struct rb_node *node = rb_last(runq);
> >
> Comment style. And I think I'd say "Check the rightmost element in the
> tree, which is the one with the highest credits"
>
> > +        struct csched2_vcpu *swait = runq_elem(node);
> >
> I think we can do:
>
>   struct csched2_vcpu *swait = runq_elem(rb_last(runq));
>
> Yeah, matter of taste, mostly. Still...
>
> Anyway, if you keep the code like this, no blanks in-between
> definitions. And you instead want one between the last definition and
> the if below.
>
> >          if ( ! is_idle_vcpu(swait->vcpu)
> >               && swait->credit > 0 )
> >          {
> >              rt_credit = snext->credit - swait->credit;
> >          }
> >      }
> > -
> >      /*
> >       * The next guy on the runqueue may actually have a higher credit,
> >
> Spurious change.
>
> >       * if we've tried to avoid migrating him from a different cpu.
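
Ok, will fix the comment style and use your wording. With the suggestions
for this hunk folded in, I think it would end up roughly like:

    if ( ! RB_EMPTY_ROOT(runq) )
    {
        /*
         * Check the rightmost element in the tree, which is the one
         * with the highest credits.
         */
        struct csched2_vcpu *swait = runq_elem(rb_last(runq));

        if ( ! is_idle_vcpu(swait->vcpu)
             && swait->credit > 0 )
        {
            rt_credit = snext->credit - swait->credit;
        }
    }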
>
> > @@ -3350,9 +3372,8 @@ runq_candidate(struct csched2_runqueue_data *rqd,
> >          snext = csched2_vcpu(idle_vcpu[cpu]);
> >
> >   check_runq:
> > -    list_for_each_safe( iter, temp, &rqd->runq )
> > -    {
> > -        struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
> > +    for (iter = rb_last(&rqd->runq); iter != NULL; iter = rb_prev(iter)) {
> > +        struct csched2_vcpu * svc = rb_entry(iter, struct csched2_vcpu, runq_elem);
> >
> I was about to comment about the fact that doing a for() is not the
> same as doing the list_for_each_safe() (because of the *_safe() part).
>
> However, looking better, I don't think I see why list_for_each_safe()
> (I mean, as opposed to just list_for_each()) was used in the first
> place.
>
> So, yes, I think this is actually ok. Anyone, if there's something I'm
> missing, please, point it out. :-D
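
Right. Just to spell out the pattern this relies on (a sketch; as you
note, it only holds as long as the loop body does not remove 'iter' from
the tree, since rb_prev(iter) is evaluated afterwards):

    struct rb_node *iter;

    /* Walk the runqueue from highest to lowest credit. */
    for ( iter = rb_last(&rqd->runq); iter != NULL; iter = rb_prev(iter) )
    {
        struct csched2_vcpu *svc = rb_entry(iter, struct csched2_vcpu, runq_elem);

        /* ... use svc; nothing here takes it out of the tree ... */
    }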
>
> > @@ -3762,8 +3784,8 @@ csched2_dump(const struct scheduler *ops)
> >              dump_pcpu(ops, j);
> >
> >          printk("RUNQ:\n");
> > -        list_for_each( iter, runq )
> > -        {
> > +
> > +        for (iter = rb_last(runq); iter != NULL; iter = rb_prev(iter)) {
> >              struct csched2_vcpu *svc = runq_elem(iter);
> >
> >              if ( svc )
> >
> Ok, this makes sense. Have you verified that the runqueue is printed in
> credits order in the dump?
>

Yes, I have verified it by dumping the runqueue with 'xl debug-keys r'.

dmesg output:
...
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=549728084734
(XEN) Online Cpus: 0-3
(XEN) Cpupool 0:
(XEN) Cpus: 0-3
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN) default-weight     = 256
(XEN) Runqueue 0:
(XEN) ncpus              = 4
(XEN) cpus               = 0-3
(XEN) max_weight         = 256
(XEN) pick_bias          = 1
(XEN) instload           = 1
(XEN) aveload            = 10002 (~3%)
(XEN) idlers: 7
(XEN) tickled: 0
(XEN) fully idle cores: 7
(XEN) Domain info:
(XEN) Domain: 0 w 256 c 0 v 4
(XEN)   1: [0.0] flags=2 cpu=3 credit=9803083 [w=256] load=517 (~0%)
(XEN)   2: [0.1] flags=0 cpu=1 credit=10404026 [w=256] load=239 (~0%)
(XEN)   3: [0.2] flags=0 cpu=0 credit=10369899 [w=256] load=2193 (~0%)
(XEN)   4: [0.3] flags=0 cpu=2 credit=10500000 [w=256] load=1354 (~0%)
(XEN) Runqueue 0:
(XEN) CPU[00] runq=0, sibling=1, core=f
(XEN) CPU[01] runq=0, sibling=2, core=f
(XEN) CPU[02] runq=0, sibling=4, core=f
(XEN) CPU[03] runq=0, sibling=8, core=f
(XEN) run: [0.0] flags=2 cpu=3 credit=9803083 [w=256] load=517 (~0%)
(XEN) RUNQ:

kern-xen-dev:~ #

Will send the updated patch soon. Thanks.

Regards,

~Praveen.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

