
Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin

On Thu, 2018-04-12 at 15:15 +0200, Dario Faggioli wrote:
> On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote:
> > 
> > dies after the first iteration.
> > 
> >         BUG_ON(!test_bit(_VPF_migrating, &prev->pause_flags));
> > 
Update. I replaced this:

+        BUG_ON(vcpu_runnable(prev));
+        BUG_ON(!test_bit(_VPF_migrating, &prev->pause_flags));

with this, in the patch:

+        if (vcpu_runnable(prev) || !test_bit(_VPF_migrating, &prev->pause_flags))
+            printk("d%uv%d runnbl=%d proc=%d pf=%lu\n",
+                   prev->domain->domain_id, prev->vcpu_id,
+                   vcpu_runnable(prev), prev->processor, prev->pause_flags);
+        BUG_ON(!test_bit(_VPF_migrating, &prev->pause_flags));

Output is:

(XEN) d10v0 runnbl=1 proc=31 pf=0
(XEN) Xen BUG at schedule.c:1572

On CPU 16.

It is still the BUG_ON(!test_bit(VPF_migrating)) which is triggering (I
actually meant to get rid of that as well, but I forgot.)

So, it looks like before, we did not hit BUG_ON(vcpu_runnable(prev)),
while in this run, vcpu_runnable(prev) is 1. I mean, I know it's a
race, but... wow...
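
For anyone reading along, here is a minimal standalone sketch of how the
logged values map onto the two checks. It is a model, not Xen code: the
struct, the model_* names and the bit position of the migrating flag are
assumptions made for illustration (check sched.h in the tree being debugged),
and the runnable test is only modeled on the vcpu_runnable() helper.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: bit position of the migrating flag in pause_flags. */
#define MODEL_VPF_MIGRATING_BIT 3

/* Simplified stand-in for struct vcpu: only the fields the printk reports,
 * plus the two pause counters that a runnable test would also consult. */
struct model_vcpu {
    unsigned long pause_flags;
    int pause_count;
    int domain_pause_count;
    unsigned int processor;
};

/* Plain, non-atomic bit test on a single word. */
static bool model_test_bit(unsigned int nr, unsigned long word)
{
    return (word >> nr) & 1UL;
}

/* Runnable means: no pause flag set and no pause count held. */
static bool model_vcpu_runnable(const struct model_vcpu *v)
{
    return !(v->pause_flags |
             (unsigned long)v->pause_count |
             (unsigned long)v->domain_pause_count);
}

/* True when BUG_ON(!test_bit(_VPF_migrating, ...)) would fire. */
static bool model_bug_would_fire(const struct model_vcpu *v)
{
    return !model_test_bit(MODEL_VPF_MIGRATING_BIT, v->pause_flags);
}
```

With pause_flags = 0 and both counters at 0, as in the logged line,
model_vcpu_runnable() is true and model_bug_would_fire() is true. With only
the migrating bit set, both are false, which is the state the two original
BUG_ON()s expected.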

We are in here because VPF_migrating was set, but it must be getting
cleared, concurrently with us, at about this time.

We are on CPU 16, inside context_saved(), and our 'prev' is d10v0. This
means its 'processor' should still be 16. But it's 31, so someone has
changed it already. I'm assuming it was the vcpu_migrate() called from
vcpu_set_affinity(). And this could very well be fine, but then why do we
also find VPF_migrating set when we are inside vcpu_migrate()?
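
To make the suspected ordering concrete, here is a single-threaded replay of
it. This is only a sketch of the hypothesis above, not a confirmed diagnosis
and not Xen code: the sketch_* names, the bit position and the assumption that
the other CPU consumes the flag with a test-and-clear before changing
'processor' are all illustrative.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_VPF_MIGRATING (1UL << 3)  /* assumed bit position */

struct sketch_vcpu {
    unsigned long pause_flags;
    unsigned int processor;
};

/* Non-atomic stand-in for test_and_clear_bit(); good enough here because
 * the replay is sequential. */
static bool sketch_test_and_clear(unsigned long mask, unsigned long *word)
{
    bool was_set = (*word & mask) != 0;

    *word &= ~mask;
    return was_set;
}

/* What the other CPU is suspected to do: consume the flag, move the vcpu. */
static void sketch_remote_migrate(struct sketch_vcpu *v, unsigned int new_cpu)
{
    if (sketch_test_and_clear(SKETCH_VPF_MIGRATING, &v->pause_flags))
        v->processor = new_cpu;
}

/* Replays the suspected order of events. Returns true if the late check on
 * the first CPU would find the flag already gone (i.e. hit the BUG_ON). */
static bool sketch_replay(struct sketch_vcpu *v)
{
    /* Step 1: CPU 16 sees the flag set and takes the migration path. */
    if (!(v->pause_flags & SKETCH_VPF_MIGRATING))
        return false;

    /* Step 2: meanwhile, another CPU clears the flag and moves the vcpu. */
    sketch_remote_migrate(v, 31);

    /* Step 3: CPU 16 re-checks the flag. */
    return !(v->pause_flags & SKETCH_VPF_MIGRATING);
}
```

Starting from pause_flags = SKETCH_VPF_MIGRATING and processor = 16,
sketch_replay() returns true and leaves pause_flags at 0 and processor at 31,
which are the same values as in the logged line.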

I'll add more debugging to check if the vcpu is in a runqueue...

<<This happens because I choose it to happen!>> (Raistlin Majere)
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/
