
Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote:
> On Thu, 12 Apr 2018 12:16:34 +0200,
> Dario Faggioli <dfaggioli@xxxxxxxx> wrote:
> > Olaf, new patch. Please, remove _everything_ and apply _only_ this
> > one.
> dies after the first iteration.
>         BUG_ON(!test_bit(_VPF_migrating, &prev->pause_flags));
So, VPF_migrating is set when we enter the if() and decide to call
vcpu_sleep_nosync() and vcpu_migrate(), but it is no longer set here,
once we have taken the lock.

Interestingly, we did not hit the BUG_ON(vcpu_runnable(prev)) right
before it.

Anyway, there is only one place where VPF_migrating is reset, and that
is in vcpu_migrate().

So, based on our theory that we are running concurrently with
vcpu_set_affinity(), it's the call to vcpu_migrate() from
vcpu_set_affinity() that resets it.

I need to think a bit more (I'm trying to picture the exact scenario),
but as of now it still does not make sense... It looks to me that it is
the call to vcpu_sleep_nosync(), also from vcpu_set_affinity(), that
should have removed prev from the runqueue.

True, vcpu_migrate() ends with vcpu_wake(), which puts it back in a
runqueue, but then again, our vcpu_migrate(), here in context_saved(),
finding that VPF_migrating is off, should *not* call vcpu_wake().

This is getting insane (or I am)... :-O

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
