Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin
On Thu, 2018-04-12 at 09:38 +0000, George Dunlap wrote:
> > On Apr 11, 2018, at 10:31 PM, Dario Faggioli <raistlin@xxxxxxxx>
> > wrote:
> > (XEN) Xen BUG at sched_credit.c:876
> > (XEN) ----[ Xen-4.11.20180410T125709.50f8ba84a5-7.bug1087289_411 x86_64 debug=y Not tainted ]----
> > (XEN) CPU: 108
> > (XEN) RIP: e008:[<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x54
> > (XEN) RFLAGS: 0000000000010006 CONTEXT: hypervisor
> > ...
> > (XEN) Xen call trace:
> > (XEN) [<ffff82d080229ab4>] sched_credit.c#csched_vcpu_migrate+0x27/0x54
> > (XEN) [<ffff82d080236348>] schedule.c#vcpu_move_locked+0xbb/0xc2
> > (XEN) [<ffff82d08023764c>] schedule.c#vcpu_migrate+0x226/0x25b
> > (XEN) [<ffff82d080239367>] context_saved+0x95/0x9c
> > (XEN) [<ffff82d08027797d>] context_switch+0xe66/0xeb0
> > (XEN) [<ffff82d080236943>] schedule.c#schedule+0x5f4/0x627
> > (XEN) [<ffff82d080239f15>] softirq.c#__do_softirq+0x85/0x90
> > (XEN) [<ffff82d080239f6a>] do_softirq+0x13/0x15
> > (XEN) [<ffff82d08031f5db>] vmx_asm_do_vmentry+0x2b/0x30
> >
> > So, really *exactly* the same. Ok, thanks.
>
> But this doesn’t make any sense. If you applied Dario’s ‘fix’ patch,
> then context_saved() should have *just* called vcpu_sleep_nosync()
> before calling vcpu_migrate(). The VPF_migrating flag should still
> be set, so it should have called csched_vcpu_sleep(); and sd->curr
> should have been changed to be != prev way back in schedule(), so
> csched_vcpu_sleep() should have called runq_remove().
>
Well, you've just described me, banging my head on my desk, since
yesterday afternoon. :-P

> It’s probably worth asking the obvious question: Are you sure the
> “fix” patch is actually applied (in addition to the new “debug”
> patch)? :-)
>
> If so, then maybe it’s time to open-code vcpu_sleep_nosync() there in
> context_saved(), to try to figure out where our understanding of what
> *should* happen is incorrect.
>
Ehm... Can you please stop reading my mind? It's annoying. :-D

Well, I guess we can say: "great minds think alike". :-P

Olaf, new patch. Please, remove _everything_ and apply _only_ this one.

As George is saying, the vcpu just can't be in the runqueue, unless:
1) vcpu_sleep_nosync() did not remove it
2) someone is putting it back there

Let's check 1 first.

Regards,
Dario
--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: context-save-race-debug.patch
Attachment: signature.asc

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
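
For readers following the reasoning in this message, the two hypotheses can be sketched as a toy model in C. This is not Xen code and not the attached debug patch: every `toy_` name below is hypothetical, `migrating` only stands in for the VPF_migrating flag, and `on_runq = false` only stands in for runq_remove(). The sketch assumes the sleep path skips the dequeue when the vcpu is still the current one on that pcpu, which is the condition George describes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or functions exist in Xen. */
struct toy_vcpu {
    bool migrating;   /* stands in for the VPF_migrating pause flag */
    bool on_runq;     /* is the vcpu currently linked into a runqueue? */
};

struct toy_pcpu {
    struct toy_vcpu *curr;   /* stands in for the per-CPU "curr" pointer */
};

/* Sleep path: dequeue the vcpu unless it is still current on this pcpu. */
static void toy_sleep(struct toy_pcpu *p, struct toy_vcpu *v)
{
    if (p->curr == v)
        return;           /* hypothesis 1: the dequeue is skipped */
    v->on_runq = false;   /* stands in for runq_remove() */
}

/* Migrate path: reports whether the "not on a runqueue" invariant holds. */
static bool toy_migrate_ok(const struct toy_vcpu *v)
{
    return !v->on_runq;
}

/*
 * Expected sequence: sleep first, then migrate. requeue_race simulates
 * hypothesis 2, where something re-queues the vcpu in between.
 */
static bool toy_context_saved(struct toy_pcpu *p, struct toy_vcpu *prev,
                              bool requeue_race)
{
    assert(p != NULL && prev != NULL);
    if (!prev->migrating)
        return true;              /* nothing to migrate in this model */
    toy_sleep(p, prev);
    if (requeue_race)
        prev->on_runq = true;     /* someone puts it back on the runqueue */
    return toy_migrate_ok(prev);
}
```

In this model the invariant holds only when `curr` has already moved away from `prev` and nothing re-queues the vcpu between the two steps. Either hypothesis makes `toy_context_saved()` return false, the analogue of hitting the BUG in the trace above.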