[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v1 3/5] xen: sched: null: deassign offline vcpus from pcpus
On Wed, 2019-07-17 at 17:04 +0100, George Dunlap wrote: > On 8/25/18 1:21 AM, Dario Faggioli wrote: > > If a pCPU has been/is being offlined, we don't want it to be > > neither > > assigned to any pCPU, nor in the wait list. > > > > Therefore, when we detect that a vcpu is going offline, remove it > > from > > both places. > > Hmm, this commit message wasn't very informative. > Ok, we can certainly improve that. :-) > It looks like what you really mean to do is: > > > @@ -518,6 +521,14 @@ static void null_vcpu_remove(const struct > > scheduler *ops, struct vcpu *v) > > > > lock = vcpu_schedule_lock_irq(v); > > > > + /* If offline, the vcpu shouldn't be assigned, nor in the > > waitqueue */ > > + if ( unlikely(!is_vcpu_online(v)) ) > > + { > > + ASSERT(per_cpu(npc, v->processor).vcpu != v); > > + ASSERT(list_empty(&nvc->waitq_elem)); > > + goto out; > > + } > > * Handle the case of an offline vcpu being removed (ASSERTing that > it's > neither on a processor nor on the waitqueue) > So, IIRC (sorry, it's been a while :-D ), this is for dealing with remove_vcpu() being called on a vcpu which is offline. So, yes, basically what you said. :-) Point is the work of removing such vCPU from any CPU and from the wait list has been done already, in null_vcpu_sleep(), while the vCPU was going offline. So, here, we only need to make sure that we don't do anything, i.e., that we don't call _vcpu_remove(). > But wait, isn't this fixing a important regression in patch 2? If > after > patch 2 but before patch 3, a VM is created with offline vcpus, and > then > destroyed, won't the offline vcpus reach here neither on the waitlist > nor on a vcpu? > I'm not sure I understand the point you're trying to make here, sorry. In general, considering what we've said in other mails, if you think that patch 2 and 3 should be merged, we can do that. My reasoning, when putting together the series, was the one I already stated: this is broken already, so no big deal breaking it "more", and I continue to see it that way. But I appreciate you seeing it differently, while I don't have a too strong opinion, so I'd be fine merging the patches (or doing other series rearrangements, if you feel strongly that they're necessary). Or is it something completely different that you meant? > > @@ -567,11 +578,31 @@ static void null_vcpu_wake(const struct > > scheduler *ops, struct vcpu *v) > > > > static void null_vcpu_sleep(const struct scheduler *ops, struct > > vcpu *v) > > { > > + struct null_private *prv = null_priv(ops); > > + unsigned int cpu = v->processor; > > + bool tickled = false; > > + > > ASSERT(!is_idle_vcpu(v)); > > > > + /* We need to special case the handling of the vcpu being > > offlined */ > > + if ( unlikely(!is_vcpu_online(v)) ) > > + { > > + struct null_vcpu *nvc = null_vcpu(v); > > + > > + printk("YYY d%dv%d going down?\n", v->domain->domain_id, > > v->vcpu_id); > > + if ( unlikely(!list_empty(&nvc->waitq_elem)) ) > > + { > > + spin_lock(&prv->waitq_lock); > > + list_del_init(&nvc->waitq_elem); > > + spin_unlock(&prv->waitq_lock); > > + } > > + else if ( per_cpu(npc, cpu).vcpu == v ) > > + tickled = vcpu_deassign(prv, v); > > + } > > * Handle the unexpected(?) case of a vcpu being put to sleep as(?) > it's > offlined > Well, that printk, really shouldn't be there! :-( > If it's not unexpected, then why the printk? > Because it was a debugging aid, for while I was working on the series, and I apparently failed at killing it before sending. Sorry. :-( > And if it is unexpected, what is the expected path for a cpu going > offline to be de-assigned from a vcpu (which is what the title seems > to > imply this patch is about)? > This is the vCPU going down, when do_vcpu_op(VCPU_down) is invoked on it, which then calls vcpu_sleep_nosync() with _VPF_down set in pause_flags (which means vcpu_is_online() would be false. So it is indeed the _expected_ path, and sorry again if the stray debugging printk made you think otherwise. > > @@ -615,12 +646,12 @@ static void null_vcpu_migrate(const struct > > scheduler *ops, struct vcpu *v, > > * > > * In the latter, there is just nothing to do. > > */ > > - if ( likely(list_empty(&nvc->waitq_elem)) ) > > + if ( likely(per_cpu(npc, v->processor).vcpu == v) ) > > { > > vcpu_deassign(prv, v); > > SCHED_STAT_CRANK(migrate_running); > > } > > - else > > + else if ( !list_empty(&nvc->waitq_elem) ) > > SCHED_STAT_CRANK(migrate_on_runq); > > * Teach null_vcpu_migrate() that !on_waitqueue != on_vcpu. > > It looks like the comment just above this hunk is now out of date: > > "v is either assigned to a pCPU, or in the waitqueue." > Yes it does. Thanks and Regards -- Dario Faggioli, Ph.D http://about.me/dario.faggioli Virtualization Software Engineer SUSE Labs, SUSE https://www.suse.com/ ------------------------------------------------------------------- <<This happens because _I_ choose it to happen!>> (Raistlin Majere) Attachment:
signature.asc _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |