Re: [Xen-devel] [PATCH 1/2] sched: credit2: respect per-vcpu hard affinity
On Mon, Jan 19, 2015 at 9:21 PM, Justin Weaver <jtweaver@xxxxxxxxxx> wrote:
> On Mon, Jan 12, 2015 at 8:05 AM, Dario Faggioli
> <dario.faggioli@xxxxxxxxxx> wrote:
>>> if ( __vcpu_on_runq(svc) )
>>> + on_runq = 1;
>>> +
>>> + /* If the runqs are different, move svc to trqd. */
>>> + if ( svc->rqd != trqd )
>>> {
>>> - __runq_remove(svc);
>>> - update_load(ops, svc->rqd, svc, -1, now);
>>> - on_runq=1;
>>> + if ( on_runq )
>>> + {
>>> + __runq_remove(svc);
>>> + update_load(ops, svc->rqd, svc, -1, now);
>>> + }
>>> + __runq_deassign(svc);
>>> + __runq_assign(svc, trqd);
>>> + if ( on_runq )
>>> + {
>>> + update_load(ops, svc->rqd, svc, 1, now);
>>> + runq_insert(ops, svc->vcpu->processor, svc);
>>> + }
>>> }
>>> - __runq_deassign(svc);
>>> - svc->vcpu->processor = cpumask_any(&trqd->active);
>>> - __runq_assign(svc, trqd);
>>> +
>>>
>> Mmm.. I do not like the way the code looks after this is applied. Before
>> the patch, it was really straightforward and easy to understand. Now
>> it's way more involved. Can you explain why this rework is necessary?
>> For now do it here, then we'll see whether and how to put that into a
>> doc comment.
>
> When I was testing, if I changed a vcpu's hard affinity away from its
> current pcpu to another pcpu in the same run queue, the VM would stop
> executing. I'll go back and look at this, because I see what you wrote
> below about wake being called by vcpu_migrate in schedule.c; the vcpu
> shouldn't freeze on the old cpu, it should wake on the new cpu whether
> the run queue changed or not. I'll address this again after some
> testing.
>>> @@ -1399,8 +1531,12 @@ csched2_vcpu_migrate(
>>>
>>> trqd = RQD(ops, new_cpu);
>>>
>>> - if ( trqd != svc->rqd )
>>> - migrate(ops, svc, trqd, NOW());
>>> + /*
>>> + * Call migrate even if svc->rqd == trqd; there may have been an
>>> + * affinity change that requires a call to runq_tickle for a new
>>> + * processor within the same run queue.
>>> + */
>>> + migrate(ops, svc, trqd, NOW());
>>> }
>>>
>> As said above, I don't think I see the reason for this. Affinity
>> changes, e.g., due to calls to vcpu_set_affinity() in schedule.c, force
>> the vcpu through a sleep/wakeup cycle (vcpu_sleep_nosync() is called
>> directly, while vcpu_wake() is called inside vcpu_migrate()).
>>
>> So, it looks like what you are after (i.e., runq_tickle being called)
>> should already happen, shouldn't it? Is there some other reason you
>> need it?
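For reference, the path you're describing (condensed from
xen/common/schedule.c of that era, with locking and error handling
trimmed, so a sketch rather than the exact code) looks roughly like
this:

    static int vcpu_set_affinity(
        struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
    {
        spinlock_t *lock = vcpu_schedule_lock_irq(v);

        cpumask_copy(which, affinity);

        /* Always ask the scheduler to re-evaluate placement. */
        set_bit(_VPF_migrating, &v->pause_flags);

        vcpu_schedule_unlock_irq(lock, v);

        domain_update_node_affinity(v->domain);

        if ( test_bit(_VPF_migrating, &v->pause_flags) )
        {
            vcpu_sleep_nosync(v);
            vcpu_migrate(v);  /* picks a new cpu, ends with vcpu_wake(v) */
        }

        return 0;
    }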
>
> Like I said above, I will look at this again. My VMs were getting
> stuck after certain hard affinity changes. I'll roll back some of
> these changes and test it out again.
I discovered that SCHED_OP(VCPU2OP(v), wake, v); in vcpu_wake() in
schedule.c is not being called because _VPF_blocked is set in v's
pause_flags.

For example:
- I start a guest with one vcpu with hard affinity 8 - 15, and xl
  vcpu-list says it is running on pcpu 15.
- I run xl vcpu-pin 1 0 8 to restrict its hard affinity to pcpu 8 only.
- When execution reaches vcpu_wake(), vcpu_runnable(v) is false because
  _VPF_blocked is set, so the call to SCHED_OP(VCPU2OP(v), wake, v); is
  skipped and the vcpu never gets a runq_tickle.
- xl vcpu-list now shows --- for the state, and I cannot console into
  the guest.
- What I don't understand, though, is that if I then enter xl vcpu-pin
  1 0 15, _VPF_blocked is NOT set, vcpu_wake() calls credit2's wake,
  the vcpu gets a runq_tickle, and everything is fine again.

Why did the value of the _VPF_blocked flag change after I entered xl
vcpu-pin the second time? I dug deep into the code and could not
figure it out.
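For context, the gate I am hitting is in vcpu_wake(); paraphrased from
schedule.c and sched.h (the real function handles more cases than
shown), it looks roughly like this:

    /* Any pause flag, including _VPF_blocked, makes the vcpu not runnable. */
    static inline int vcpu_runnable(struct vcpu *v)
    {
        return !(v->pause_flags |
                 atomic_read(&v->pause_count) |
                 atomic_read(&v->domain->pause_count));
    }

    void vcpu_wake(struct vcpu *v)
    {
        unsigned long flags;
        spinlock_t *lock = vcpu_schedule_lock_irqsave(v, &flags);

        if ( likely(vcpu_runnable(v)) )  /* false here: _VPF_blocked is set */
        {
            if ( v->runstate.state >= RUNSTATE_blocked )
                vcpu_runstate_change(v, RUNSTATE_runnable, NOW());
            SCHED_OP(VCPU2OP(v), wake, v);  /* skipped: no runq_tickle */
        }

        vcpu_schedule_unlock_irqrestore(lock, flags, v);
    }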
That is why v1 of my patch worked: I let it run migrate during an
affinity change even if the current and destination run queues were
the same, so it did the processor assignment and runq_tickle
regardless. I think you'll have to tell me if that's a hack or a good
solution!
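To make that concrete, the same-runqueue handling was along these
lines (a sketch from memory, not the exact v1 hunk; the on-stack
cpumask is just for illustration, real code would use a scratch mask):

    static void migrate(const struct scheduler *ops,
                        struct csched2_vcpu *svc,
                        struct csched2_runqueue_data *trqd,
                        s_time_t now)
    {
        if ( svc->rqd == trqd )
        {
            cpumask_t mask;

            /* Same runqueue: re-pick a processor the hard affinity allows... */
            cpumask_and(&mask, svc->vcpu->cpu_hard_affinity, &trqd->active);
            svc->vcpu->processor = cpumask_any(&mask);
            /* ...and tickle so the vcpu actually gets picked up there. */
            runq_tickle(ops, svc->vcpu->processor, svc, now);
            return;
        }

        /* Different runqueue: the full move, as in the hunk quoted above. */
    }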
I greatly appreciate any feedback.
Thank you,
Justin
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel