Re: [Xen-devel] [PATCH v2 4/4] VMX: fixup PI descritpor when cpu is offline
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: Friday, May 27, 2016 10:57 PM
> To: Wu, Feng <feng.wu@xxxxxxxxx>
> Cc: andrew.cooper3@xxxxxxxxxx; dario.faggioli@xxxxxxxxxx;
> george.dunlap@xxxxxxxxxxxxx; Tian, Kevin <kevin.tian@xxxxxxxxx>;
> xen-devel@xxxxxxxxxxxxx; konrad.wilk@xxxxxxxxxx; keir@xxxxxxx
> Subject: Re: [PATCH v2 4/4] VMX: fixup PI descritpor when cpu is offline
>
> >>> On 26.05.16 at 15:39, <feng.wu@xxxxxxxxx> wrote:
> > @@ -102,9 +103,10 @@ void vmx_pi_per_cpu_init(unsigned int cpu)
> >  {
> >      INIT_LIST_HEAD(&per_cpu(vmx_pi_blocking, cpu).list);
> >      spin_lock_init(&per_cpu(vmx_pi_blocking, cpu).lock);
> > +    per_cpu(vmx_pi_blocking, cpu).down = 0;
>
> This seems pointless - per-CPU data starts out all zero (and there
> are various places already which rely on that).
>
> > @@ -122,10 +124,25 @@ static void vmx_vcpu_block(struct vcpu *v)
> >           * new vCPU to the list.
> >           */
> >          spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_hotplug_lock, flags);
> > -        return;
> > +        return 1;
> >      }
> >
> >      spin_lock(pi_blocking_list_lock);
> > +    if ( unlikely(per_cpu(vmx_pi_blocking, v->processor).down) )
>
> Is this something that can actually happen? vmx_pi_desc_fixup()
> runs in stop-machine context, i.e. no CPU can actively be here (or
> anywhere near the arch_vcpu_block() call sites).

This is related to the scheduler; maybe Dario can give some input about
this. Dario?

> > +    {
> > +        /*
> > +         * We being here means that the v->processor is going away, and all
> > +         * the vcpus on its blocking list were removed from it. Hence we
> > +         * cannot add new vcpu to it. Besides that, we return -1 to
> > +         * prevent the vcpu from being blocked. This is needed because
> > +         * if the vCPU is continue to block and here we don't put it
> > +         * in a per-cpu blocking list, it might not be woken up by the
> > +         * notification event.
> > +         */
> > +        spin_unlock(pi_blocking_list_lock);
> > +        spin_unlock_irqrestore(&v->arch.hvm_vmx.pi_hotplug_lock, flags);
> > +        return 0;
>
> The comment says you mean to return -1 here...
>
> > +void vmx_pi_desc_fixup(int cpu)
>
> unsigned int
>
> > +{
> > +    unsigned int new_cpu, dest;
> > +    unsigned long flags;
> > +    struct arch_vmx_struct *vmx, *tmp;
> > +    spinlock_t *new_lock, *old_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
> > +    struct list_head *blocked_vcpus = &per_cpu(vmx_pi_blocking, cpu).list;
> > +
> > +    if ( !iommu_intpost )
> > +        return;
> > +
> > +    spin_lock_irqsave(old_lock, flags);
> > +    per_cpu(vmx_pi_blocking, cpu).down = 1;
> > +
> > +    list_for_each_entry_safe(vmx, tmp, blocked_vcpus, pi_blocking.list)
> > +    {
> > +        /*
> > +         * We need to find an online cpu as the NDST of the PI descriptor, it
> > +         * doesn't matter whether it is within the cpupool of the domain or
> > +         * not. As long as it is online, the vCPU will be woken up once the
> > +         * notification event arrives.
> > +         */
> > +        new_cpu = cpu;
> > +restart:
>
> Labels indented by at least one blank please. Or even better, get
> things done without goto.
>
> > +        while ( 1 )
> > +        {
> > +            new_cpu = (new_cpu + 1) % nr_cpu_ids;
> > +            if ( cpu_online(new_cpu) )
> > +                break;
> > +        }
>
> Please don't open code things like cpumask_cycle(). But with the
> restart logic likely unnecessary (see below), this would probably
> better become cpumask_any() then.
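(For illustration only: a minimal sketch of what the suggested helpers
could look like in place of the open-coded loop, assuming cpu_online_map
as the mask to pick from; this is not part of the patch under review.)

        /* Any online CPU will do once the restart logic is dropped: */
        new_cpu = cpumask_any(&cpu_online_map);

        /* Or, to cycle onwards from the CPU going offline: */
        new_cpu = cpumask_cycle(cpu, &cpu_online_map);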
> > +        new_lock = &per_cpu(vmx_pi_blocking, cpu).lock;
>
> DYM new_cpu here? In fact with ...
>
> > +        spin_lock(new_lock);
>
> ... this I can't see how you would have successfully tested this
> new code path, as I can't see how this would end in other than
> a deadlock (as you hold this very lock already).
>
> > +        /*
> > +         * After acquiring the blocking list lock for the new cpu, we need
> > +         * to check whether new_cpu is still online.
>
> How could it have gone offline?

And I think this also needs Dario's confirmation, as he is the scheduler
expert.

> As mentioned, CPUs get brought
> down in stop-machine context (and btw for the very reason of
> avoiding complexity like this).
>
> > +         * If '.down' is true, it mean 'new_cpu' is also going to be offline,
> > +         * so just go back to find another one, otherwise, there are two
> > +         * possibilities:
> > +         *   case 1 - 'new_cpu' is online.
> > +         *   case 2 - 'new_cpu' is about to be offline, but doesn't get to
> > +         *            the point where '.down' is set.
> > +         * In either case above, we can just set 'new_cpu' to 'NDST' field.
> > +         * For case 2 the 'NDST' field will be set to another online cpu when
> > +         * we get to this function for 'new_cpu' some time later.
> > +         */
> > +        if ( per_cpu(vmx_pi_blocking, cpu).down )
>
> And again I suspect you mean new_cpu here.
>
> > --- a/xen/common/schedule.c
> > +++ b/xen/common/schedule.c
> > @@ -833,10 +833,8 @@ void vcpu_block(void)
> >
> >      set_bit(_VPF_blocked, &v->pause_flags);
> >
> > -    arch_vcpu_block(v);
> > -
> >      /* Check for events /after/ blocking: avoids wakeup waiting race. */
> > -    if ( local_events_need_delivery() )
> > +    if ( arch_vcpu_block(v) || local_events_need_delivery() )
>
> Here as well as below I'm getting the impression that you have things
> backwards: vmx_vcpu_block() returns true for the two pre-existing
> return paths (in which case you previously did not enter this if()'s
> body), and false on the one new return path. Plus ...
>
> > --- a/xen/include/asm-x86/hvm/hvm.h
> > +++ b/xen/include/asm-x86/hvm/hvm.h
> > @@ -608,11 +608,13 @@ unsigned long hvm_cr4_guest_reserved_bits(const struct vcpu *v, bool_t restore);
> >   * not been defined yet.
> >   */
> >  #define arch_vcpu_block(v) ({                                  \
> > +    bool_t rc = 0;                                             \
> >      struct vcpu *v_ = (v);                                     \
> >      struct domain *d_ = v_->domain;                            \
> >      if ( has_hvm_container_domain(d_) &&                       \
> >           d_->arch.hvm_domain.vmx.vcpu_block )                  \
> > -        d_->arch.hvm_domain.vmx.vcpu_block(v_);                \
> > +        rc = d_->arch.hvm_domain.vmx.vcpu_block(v_);           \
> > +    rc;                                                        \
> >  })
>
> ... rc defaulting to zero here supports my suspicion of something
> having got mixed up.

Oh, yes, it should be like this:

-    if ( local_events_need_delivery() )
+    if ( !arch_vcpu_block(v) || local_events_need_delivery() )

Thanks,
Feng

>
> Jan
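(For reference, and only as a sketch rather than the final patch: one
self-consistent way of wiring up the return value that both remarks above
point at. The convention assumed here is that the hook returns 1 when the
vCPU has been queued on a per-CPU blocking list and may block, and 0 when
it must not block; a guest without the hook then needs to default to
"may block", i.e. the macro's fallback is assumed to become 1 rather than 0.)

#define arch_vcpu_block(v) ({                                       \
    bool_t rc = 1; /* assumed default: no hook, blocking may proceed */ \
    struct vcpu *v_ = (v);                                           \
    struct domain *d_ = v_->domain;                                  \
    if ( has_hvm_container_domain(d_) &&                             \
         d_->arch.hvm_domain.vmx.vcpu_block )                        \
        rc = d_->arch.hvm_domain.vmx.vcpu_block(v_);                 \
    rc;                                                              \
})

    /* ... and in vcpu_block(): */
    set_bit(_VPF_blocked, &v->pause_flags);

    /* Check for events /after/ blocking: avoids wakeup waiting race. */
    if ( !arch_vcpu_block(v) || local_events_need_delivery() )
        clear_bit(_VPF_blocked, &v->pause_flags);   /* do not block */
    else
    {
        /* ... proceed with blocking as in the existing code ... */
    }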