
Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin



On Fri, 2018-04-13 at 09:03 +0000, George Dunlap wrote:
> > On Apr 12, 2018, at 6:25 PM, Dario Faggioli <dfaggioli@xxxxxxxx>
> > wrote:
> > 
> I think the bottom line is, for this test to be valid, then at this
> point test_bit(VPF_migrating) *must* imply !vcpu_on_runqueue(v), but
> at this point it doesn’t: If someone else has come by and cleared the
> bit, done migration, and woken it up, and then someone *else* set the
> bit again without taking it off the runqueue, it may still be on the
> runqueue.
> 
> My series which calls vcpu_sleep_nosync_locked() after setting
> VPF_migrating should help with this.
> 
Yes. In fact, Olaf, I still think that doing a run with George's RFC
applied would be useful, if only as a data point.

> Or, alternately, instead of baking all this implicit knowledge about
> credit into the scheduler, we should just implement
> credit_vcpu_migrate(), and have it remove it from one runqueue and
> put it on another.
> 
But it's not really "baking Credit implicit knowledge", IMO. It is that
we have an invariant which we are failing to enforce.

That's why your series goes in the right direction: by calling sleep()
in the same critical section in which the bit is set, it improves how
we enforce the invariant.
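
To make the invariant concrete, here is a minimal toy model in C. It is
not Xen code: the struct, the flag and the function names are all
hypothetical stand-ins for v->pause_flags, VPF_migrating and
vcpu_on_runqueue(v), and locking is only implied by the "_locked"
suffix. It only shows why setting the bit and dequeuing in one step
keeps "migrating implies not on the runqueue" true, while setting the
bit alone does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, NOT Xen code: every name here is hypothetical. */
#define TOY_VPF_MIGRATING (1u << 0)

struct toy_vcpu {
    unsigned int pause_flags; /* stands in for v->pause_flags */
    bool on_runqueue;         /* stands in for vcpu_on_runqueue(v) */
};

/* The invariant under discussion: migrating implies not on a runqueue. */
static bool toy_invariant_holds(const struct toy_vcpu *v)
{
    return !(v->pause_flags & TOY_VPF_MIGRATING) || !v->on_runqueue;
}

/*
 * Problematic pattern: only the bit is set. If the vcpu is still on a
 * runqueue at this point, the invariant is broken until someone else
 * dequeues it.
 */
static void toy_set_migrating_only(struct toy_vcpu *v)
{
    v->pause_flags |= TOY_VPF_MIGRATING;
}

/*
 * Pattern the series moves towards: set the bit and take the vcpu off
 * the runqueue together. The caller is assumed to hold the scheduler
 * lock, so no one can observe the intermediate state.
 */
static void toy_set_migrating_and_sleep_locked(struct toy_vcpu *v)
{
    v->pause_flags |= TOY_VPF_MIGRATING;
    v->on_runqueue = false;
}
```

Starting from a vcpu that is on the runqueue, toy_invariant_holds() is
false after toy_set_migrating_only() and true after
toy_set_migrating_and_sleep_locked().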

Implementing a csched_vcpu_migrate() looks to me like "relaxing" the
invariant, which is exactly the opposite direction. :-)

We may well decide to _get_rid_ of the invariant, but I'm not sure that
implementing csched_vcpu_migrate() would be all that this takes and, in
general, I don't think that something like this:
 - is an appropriate thing to do at this point of the 4.11 cycle;
 - will be easy to backport (while, despite the look of it,
   backporting patches 1 and 2 of your series might not be too terrible).

Regards,
Dario
-- 
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
