
Re: [Xen-devel] [RFC PATCH v2 1/2] xen: credit2: rb-tree for runqueues



Hi Dario,
Thanks for your comments.

On Fri, Jan 18, 2019 at 8:38 PM Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
>
> On Sun, 2018-12-23 at 19:51 +0530, Praveen Kumar wrote:
> > diff --git a/xen/common/sched_credit2.c b/xen/common/sched_credit2.c
> > index 623a325ceb..2463a25f87 100644
> > --- a/xen/common/sched_credit2.c
> > +++ b/xen/common/sched_credit2.c
> > @@ -471,7 +472,7 @@ custom_param("credit2_runqueue",
> > parse_credit2_runqueue);
> >  struct csched2_runqueue_data {
> >      spinlock_t lock;           /* Lock for this runqueue                     */
> >
> > -    struct list_head runq;     /* Ordered list of runnable vms               */
> > +    struct rb_root runq;       /* Runqueue is an rbtree                      */
> >
> I wouldn't change the comment. It's useful to know that the idea is to
> use this field to keep a list of runnable vcpus, and that we want it to
> be ordered, which is what the comment currently says.
>
> It's pointless to state that we're using an rb-tree, because one can
> tell that just by looking at the data type.
>
> Actually, the comment says "Ordered list of runnable vms", while I
> think it would be better if it said "Ordered list of runnable vcpus".
>
> Changing "vms" to "vcpus" does not belong in this patch, strictly
> speaking, but if you want to do that, I could live with that. If you
> don't want to do it, then don't, and just leave the comment alone.
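
Ok, I will keep the comment then. With just 'vms' changed to 'vcpus', as
you suggest, the field would read something like:

      struct rb_root runq;       /* Ordered list of runnable vcpus             */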
>
> > @@ -604,6 +605,28 @@ static inline bool has_cap(const struct csched2_vcpu *svc)
> >      return svc->budget != STIME_MAX;
> >  }
> >
> > +static void rb_runq_insert(struct rb_root *root,
> > +                           struct csched2_vcpu *svc,
> > +                           int *pos)
> > +{
> > +    struct csched2_vcpu *entry = NULL;
> >
> I'd call this (something like) 'parent_svc'.
>
> It makes it easier to understand what's actually happening in the loop
> below.
>
> > +    struct rb_node **node = &root->rb_node;
> > +    struct rb_node *parent = NULL;
> > +
> > +    while (*node) {
> > +        parent = *node;
> > +        entry = rb_entry(parent, struct csched2_vcpu, runq_elem);
> > +        if ( svc->credit < entry->credit )
> > +            node = &parent->rb_left;
> > +        else
> > +            node = &parent->rb_right;
> > +
> > +        (*pos)++;
> > +    }
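
Ok, 'parent_svc' is clearer, I will rename it. For reference, the helper
would then look something like the sketch below. Please note the
rb_link_node()/rb_insert_color() tail is just the standard way to finish
an rbtree insertion and is not part of the hunk quoted above, so treat
this as a sketch rather than the exact patch code:

  static void rb_runq_insert(struct rb_root *root,
                             struct csched2_vcpu *svc,
                             int *pos)
  {
      struct csched2_vcpu *parent_svc;
      struct rb_node **node = &root->rb_node;
      struct rb_node *parent = NULL;

      while ( *node )
      {
          parent = *node;
          parent_svc = rb_entry(parent, struct csched2_vcpu, runq_elem);
          /* Lower credit goes to the left, higher (or equal) to the right. */
          if ( svc->credit < parent_svc->credit )
              node = &parent->rb_left;
          else
              node = &parent->rb_right;

          (*pos)++;
      }

      /* Standard rbtree insertion tail: link the new node and rebalance. */
      rb_link_node(&svc->runq_elem, parent, node);
      rb_insert_color(&svc->runq_elem, root);
  }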
>
> > @@ -1789,6 +1803,7 @@ static void park_vcpu(struct csched2_vcpu *svc)
> >       * In both cases, we also add it to the list of parked vCPUs of the domain.
> >       */
> >      __set_bit(_VPF_parked, &v->pause_flags);
> > +
> >      if ( vcpu_on_runq(svc) )
> >      {
> >          runq_remove(svc);
> >
> Unrelated change. Don't do it. :-)
>
> > @@ -2087,6 +2102,8 @@ csched2_vcpu_sleep(const struct scheduler *ops, struct vcpu *vc)
> >      }
> >      else if ( vcpu_on_runq(svc) )
> >      {
> > +        printk(XENLOG_WARNING "%s : %d : vcpu on runq rb !\n", __func__, __LINE__);
> > +
> >
> So, this, and all the various other lines similar to this are/have been
> useful for debugging, I guess?
>
> That is ok, but I don't think they should land upstream. If you think
> they (or maybe some of them) should, then make it/them proper debugging
> output.
>
> And, probably, do that in its own patch, as it would not be related to
> what the data structure used for the runqueues is.
>
> But again, IMO, you can get rid of all of them.
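
Ok, will drop them from the next version. If some of them turn out to be
worth keeping, I guess the proper form would be something like dprintk(),
which already prefixes file/line and, as far as I understand, compiles
away in non-debug builds, e.g.:

  dprintk(XENLOG_DEBUG, "%pv is still on the runqueue\n", vc);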
>
> >          ASSERT(svc->rqd == c2rqd(ops, vc->processor));
> >          update_load(ops, svc->rqd, svc, -1, NOW());
> >          runq_remove(svc);
>
> > @@ -2764,8 +2783,10 @@ csched2_vcpu_migrate(
> >      if ( unlikely(!cpumask_test_cpu(new_cpu, cpupool_domain_cpumask(d))) )
> >      {
> >          ASSERT(system_state == SYS_STATE_suspend);
> > +
> >          if ( vcpu_on_runq(svc) )
> >          {
> >
> Stray new-line again.
>
> > @@ -3206,17 +3227,18 @@ csched2_runtime(const struct scheduler *ops, int cpu,
> >       * 2) If there's someone waiting whose credit is positive,
> >       *    run until your credit ~= his.
> >       */
> > -    if ( ! list_empty(runq) )
> > +    if ( ! RB_EMPTY_ROOT(runq) )
> >      {
> > -        struct csched2_vcpu *swait = runq_elem(runq->next);
> > +        // Find the left most element, which is the most probable candidate
> > +        struct rb_node *node = rb_last(runq);
> >
> Comment style. And I think I'd say "Check the rightmost element in the
> tree, which is the one with the highest credits"
>
> > +        struct csched2_vcpu *swait = runq_elem(node);
> >
> I think we can do:
>
>   struct csched2_vcpu *swait = runq_elem(rb_last(runq));
>
> Yeah, matter of taste, mostly. Still...
>
> Anyway, if you keep the code like this, no blanks in-between
> definitions. And you instead want one between the last definition and
> the if below.
>
> >          if ( ! is_idle_vcpu(swait->vcpu)
> >               && swait->credit > 0 )
> >          {
> >              rt_credit = snext->credit - swait->credit;
> >          }
> >      }
> > -
> >      /*
> >       * The next guy on the runqueue may actually have a higher credit,
> >
> Spurious change.
>
> >       * if we've tried to avoid migrating him from a different cpu.
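
Ok, will fix the comment style and use your wording. With the suggestions
for this hunk folded in, I think it would end up roughly like:

    if ( ! RB_EMPTY_ROOT(runq) )
    {
        /*
         * Check the rightmost element in the tree, which is the one
         * with the highest credits.
         */
        struct csched2_vcpu *swait = runq_elem(rb_last(runq));

        if ( ! is_idle_vcpu(swait->vcpu)
             && swait->credit > 0 )
        {
            rt_credit = snext->credit - swait->credit;
        }
    }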
>
> > @@ -3350,9 +3372,8 @@ runq_candidate(struct csched2_runqueue_data *rqd,
> >          snext = csched2_vcpu(idle_vcpu[cpu]);
> >
> >   check_runq:
> > -    list_for_each_safe( iter, temp, &rqd->runq )
> > -    {
> > -        struct csched2_vcpu * svc = list_entry(iter, struct csched2_vcpu, runq_elem);
> > +    for (iter = rb_last(&rqd->runq); iter != NULL; iter = rb_prev(iter)) {
> > +        struct csched2_vcpu * svc = rb_entry(iter, struct csched2_vcpu, runq_elem);
> >
> I was about to comment about the fact that doing a for() is not the
> same as doing the list_for_each_safe() (because of the *_safe() part).
>
> However, looking better, I don't think I see why list_for_each_safe()
> (I mean, as opposed to just list_for_each()) was used in the first
> place.
>
> So, yes, I think this is actually ok. Anyone, if there's something I'm
> missing, please, point it out. :-D
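
Right. Just to spell out the pattern this relies on (a sketch; as you
note, it only holds as long as the loop body does not remove 'iter' from
the tree, since rb_prev(iter) is evaluated afterwards):

    struct rb_node *iter;

    /* Walk the runqueue from highest to lowest credit. */
    for ( iter = rb_last(&rqd->runq); iter != NULL; iter = rb_prev(iter) )
    {
        struct csched2_vcpu *svc = rb_entry(iter, struct csched2_vcpu, runq_elem);

        /* ... use svc; nothing here takes it out of the tree ... */
    }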
>
> > @@ -3762,8 +3784,8 @@ csched2_dump(const struct scheduler *ops)
> >              dump_pcpu(ops, j);
> >
> >          printk("RUNQ:\n");
> > -        list_for_each( iter, runq )
> > -        {
> > +
> > +        for (iter = rb_last(runq); iter != NULL; iter = rb_prev(iter)) {
> >              struct csched2_vcpu *svc = runq_elem(iter);
> >
> >              if ( svc )
> >
> Ok, this makes sense. Have you verified that the runqueue is printed in
> credits order in the dump?
>

Yes, I have verified it by dumping the runqueue with 'xl debug-keys r'.

dmesg output:
...
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=549728084734
(XEN) Online Cpus: 0-3
(XEN) Cpupool 0:
(XEN) Cpus: 0-3
(XEN) Scheduler: SMP Credit Scheduler rev2 (credit2)
(XEN) Active queues: 1
(XEN) default-weight     = 256
(XEN) Runqueue 0:
(XEN) ncpus              = 4
(XEN) cpus               = 0-3
(XEN) max_weight         = 256
(XEN) pick_bias          = 1
(XEN) instload           = 1
(XEN) aveload            = 10002 (~3%)
(XEN) idlers: 7
(XEN) tickled: 0
(XEN) fully idle cores: 7
(XEN) Domain info:
(XEN) Domain: 0 w 256 c 0 v 4
(XEN)   1: [0.0] flags=2 cpu=3 credit=9803083 [w=256] load=517 (~0%)
(XEN)   2: [0.1] flags=0 cpu=1 credit=10404026 [w=256] load=239 (~0%)
(XEN)   3: [0.2] flags=0 cpu=0 credit=10369899 [w=256] load=2193 (~0%)
(XEN)   4: [0.3] flags=0 cpu=2 credit=10500000 [w=256] load=1354 (~0%)
(XEN) Runqueue 0:
(XEN) CPU[00] runq=0, sibling=1, core=f
(XEN) CPU[01] runq=0, sibling=2, core=f
(XEN) CPU[02] runq=0, sibling=4, core=f
(XEN) CPU[03] runq=0, sibling=8, core=f
(XEN) run: [0.0] flags=2 cpu=3 credit=9803083 [w=256] load=517 (~0%)
(XEN) RUNQ:

kern-xen-dev:~ #

Will send the updated patch soon. Thanks.

Regards,

~Praveen.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

