
Re: [Xen-devel] [PATCH v2 1/2] xen: credit1: fix a race when picking initial pCPU for a vCPU



On Fri, 2016-08-19 at 13:23 +0100, George Dunlap wrote:
> On 18/08/16 11:00, Dario Faggioli wrote:
> > @@ -248,6 +245,33 @@ __runq_elem(struct list_head *elem)
> >      return list_entry(elem, struct csched_vcpu, runq_elem);
> >  }
> >  
> > +/* Is the first element of cpu's runq (if any) cpu's idle vcpu? */
> > +static inline bool_t is_runq_idle(unsigned int cpu)
> > +{
> > +    /*
> > +     * If we are on cpu, and we are peeking at our own runq while cpu itself
> > +     * is not idle, that's fine even if we don't hold the runq lock. In fact,
> > +     * the fact that there is a (non idle!) vcpu running means that at least
> > +     * the idle vcpu is in the runq. And since only cpu itself (via work
> > +     * stealing) can add stuff to the runq, and no other cpu will ever steal
> > +     * our idle vcpu, that makes the runq manipulations done below safe, even
> > +     * without locks.
> Thanks for investigating this and figuring out why the lockless access
> hasn't caused a problem before.  But relying on this behavior going
> forward doesn't really seem like a great idea if we can avoid it.
> 
I totally agree.
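
For reference, the invariant described in the quoted comment can be
modelled in a toy form. This is only a sketch, not the code from the
patch: the toy_* types and names are hypothetical, the runq is reduced
to a singly linked list, and treating an empty runq as idle is an
assumption based on the "(if any)" in the comment above the function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types and names are hypothetical, not Xen's. */
struct toy_vcpu {
    bool is_idle;            /* true for the pCPU's idle vcpu */
    struct toy_vcpu *next;   /* next element in the runq, or NULL */
};

struct toy_pcpu {
    unsigned int id;
    bool lock_held;          /* stands in for the per-pCPU scheduler lock */
    struct toy_vcpu *runq;   /* head of the runq, or NULL if empty */
};

/*
 * True if the runq is empty (an assumption) or its first element is
 * the idle vcpu. The caller must either be running on 'p' itself
 * (this_cpu == p->id) or hold p's lock; anything else is a bug.
 */
static bool toy_is_runq_idle(const struct toy_pcpu *p, unsigned int this_cpu)
{
    assert(p->id == this_cpu || p->lock_held);
    return p->runq == NULL || p->runq->is_idle;
}
```

The assert spells out the two cases in which peeking is acceptable in
this model: we are the pCPU in question, or we hold its lock.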

> We can't grab the pcpu scheduler lock in csched_tick(), or in the whole
> of csched_vcpu_acct() because we grab the private lock in
> __csched_vcpu_acct_start() (and that violates the locking order).  But
> is there a reason we can't grab the pcpu lock just around the call to
> _csched_cpu_pick?
> 
The first version of this patch, here in my stgit patchqueue, looked
exactly like that. ISTR I even tested it, and it works.
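
To illustrate the shape of the 'take lock' approach, here is a minimal,
self-contained sketch. It is not the actual patch: the sketch_* names,
the struct layout and the trivial pick policy are all hypothetical, and
a C11 atomic_flag spinlock stands in for the per-pCPU scheduler lock.
The only point is that the lock is held just around the pick, and not
across the rest of the accounting path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are Xen's real types or functions. */
struct sketch_pcpu {
    atomic_flag lock;          /* models the per-pCPU scheduler lock */
    unsigned int id;
    unsigned int nr_runnable;  /* state the pick must read consistently */
};

static void sketch_lock(struct sketch_pcpu *p)
{
    /* Spin until the flag was previously clear, i.e. we took the lock. */
    while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
        ;
}

static void sketch_unlock(struct sketch_pcpu *p)
{
    atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/*
 * Placeholder pick policy: stay on 'cur' if nothing is queued there,
 * otherwise go to 'fallback'. Must be called with cur->lock held.
 */
static unsigned int sketch_cpu_pick(const struct sketch_pcpu *cur,
                                    unsigned int fallback)
{
    return cur->nr_runnable == 0 ? cur->id : fallback;
}

/* Accounting path: the pCPU lock is taken only around the pick. */
static unsigned int sketch_vcpu_acct(struct sketch_pcpu *cur,
                                     unsigned int fallback)
{
    unsigned int new_cpu;

    assert(cur != NULL);

    /*
     * Work that needs a different (private) lock would go here, before
     * the pCPU lock is taken, so the two are never nested the wrong way.
     */

    sketch_lock(cur);
    new_cpu = sketch_cpu_pick(cur, fallback);
    sketch_unlock(cur);

    return new_cpu;
}
```

The cost is one extra lock/unlock pair on this path, which is the
contention trade-off discussed below.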

Then I thought that, since in this case it's all about making an
ASSERT() happy, it may be a good thing to avoid introducing more
contention. Also, I see your point on robustness/reliability. My view
is that locking on this path (if not on Credit1 in general) is already
so bad, that I don't think it's possible to make it any worse (and
hence wasn't feeling guilty about going the way I did). :-)

*BUT* I don't have too strong an opinion, and if you prefer the 'take lock'
approach, I'm fine with that.

I'll send v3.

Thanks and Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Senior Software Engineer, Citrix Systems R&D Ltd., Cambridge (UK)
