Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin
On Fri, 2018-04-13 at 08:23 +0200, Olaf Hering wrote:
> On Thu, 12 Apr 2018 19:25:43 +0200, Dario Faggioli
> <dfaggioli@xxxxxxxx> wrote:
>
> > Olaf, new patch! :-)
> > BUG_ON(__vcpu_on_runq(CSCHED_VCPU(vc)));
>
> Thanks!
>
> (XEN) CPU 36: d10v1 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4

So, FTR:
- CPU is smp_processor_id()
- dXvY is prev, in context_saved()
- isr is prev->is_running
- runnbl is vcpu_runnable(prev)
- proc is prev->processor
- pf is prev->pause_flags
- orq is __vcpu_on_runq(CSCHED_VCPU(prev)) (coming from sched_credit.c)
- csf is CSCHED_VCPU(prev)->flags

csf = 4 is CSCHED_FLAG_VCPU_MIGRATING, which means someone is calling
vcpu_migrate() on prev, on some other processor (presumably via
vcpu_set_affinity()), and is around here:

static void vcpu_migrate(struct vcpu *v)
{
    ...
    if ( v->is_running ||
         !test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
    {
        sched_spin_unlock_double(old_lock, new_lock, flags);
        return;
    }

    vcpu_move_locked(v, new_cpu);

    sched_spin_unlock_double(old_lock, new_lock, flags);

[*]

    if ( old_cpu != new_cpu )
        sched_move_irqs(v);

    /* Wake on new CPU. */
    vcpu_wake(v);
}

I.e., SCHED_OP(pick_cpu) has been called already, but vcpu_wake() has
not. We must be past the sched_spin_unlock_double(), because on this
processor (i.e., CPU 31 in this crash) we are, while printing,
_inside_ a critical section on prev's scheduler lock.

> (XEN) CPU 33: d10v2 isr=0 runnbl=0 proc=33 pf=1 orq=0 csf=4
> (XEN) CPU 20: d10v2 isr=0 runnbl=1 proc=20 pf=0 orq=0 csf=4
> (XEN) CPU 32: d10v0 isr=0 runnbl=1 proc=32 pf=0 orq=0 csf=4
> (XEN) CPU 33: d10v0 isr=0 runnbl=1 proc=12 pf=0 orq=0 csf=4
> (XEN) CPU 36: d10v0 isr=0 runnbl=1 proc=36 pf=0 orq=0 csf=4
> (XEN) CPU 31: d10v0 isr=0 runnbl=1 proc=31 pf=0 orq=0 csf=4
> (XEN) Xen BUG at sched_credit.c:877
> (XEN) ----[ Xen-4.11.20180411T100655.82540b66ce-180413055758  x86_64  debug=y  Not tainted ]----
> (XEN) CPU:    31

Right. So, in this case, vcpu_migrate()->SCHED_OP(pick_cpu) did not
change prev->processor. That can very well happen; it just means
that, were it not for the BUG_ON added in csched_vcpu_migrate() by
this patch, this iteration would not have crashed in
csched_load_balance(). However, in a previous report, we have seen a
situation where prev->processor was 31 on CPU 16.

Fact is, VPF_migrating is 0 right now, for prev, which corroborates
the theory that we are at the [*] point of vcpu_migrate(), on the
other CPU. In fact, it was 1, but test_and_clear_bit() has been
called, resetting it.

However, in order for us, on this CPU, to actually execute
vcpu_move_locked(), like we do:

> (XEN) Xen call trace:
> (XEN)    [<ffff82d08022c84d>] sched_credit.c#csched_vcpu_migrate+0x52/0x54
> (XEN)    [<ffff82d080239419>] schedule.c#vcpu_move_locked+0x42/0xcc

it means that someone must have raised VPF_migrating again!
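(For reference, this is why a re-raised VPF_migrating sends us back
into vcpu_migrate(): the tail of context_saved() re-checks the flag.
I'm quoting roughly, from memory of a 4.11-era tree, so the exact
shape may differ a bit:

void context_saved(struct vcpu *prev)
{
    /* Clear running flag /after/ writing context to memory. */
    smp_wmb();

    prev->is_running = 0;

    /* Check for migration request /after/ clearing running flag. */
    smp_mb();

    SCHED_OP(vcpu_scheduler(prev), context_saved, prev);

    if ( unlikely(prev->pause_flags & VPF_migrating) )
        vcpu_migrate(prev);
}

So a set_bit() landing after the other CPU's test_and_clear_bit() is
exactly what gets us in here, and from here into vcpu_move_locked().)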
> (XEN)    [<ffff82d08023a8d8>] schedule.c#vcpu_migrate+0x210/0x23b
> (XEN)    [<ffff82d08023c7ad>] context_saved+0x236/0x479
> (XEN)    [<ffff82d08027a558>] context_switch+0xe9/0xf67
> (XEN)    [<ffff82d0802397a9>] schedule.c#schedule+0x306/0x6ab
> (XEN)    [<ffff82d08023d56a>] softirq.c#__do_softirq+0x71/0x9a
> (XEN)    [<ffff82d08023d5dd>] do_softirq+0x13/0x15
> (XEN)    [<ffff82d080328d8b>] vmx_asm_do_vmentry+0x2b/0x30
> (XEN) ****************************************
> (XEN) Panic on CPU 31:
> (XEN) Xen BUG at sched_credit.c:877
> (XEN) ****************************************

Now, VPF_migrating is raised in the following circumstances:

* in __runq_tickle(): I actually was about to pinpoint this as the
  problem, but then I realized that, when calling __runq_tickle(prev)
  in vcpu_wake() (called by vcpu_migrate()), we do not set the bit on
  prev itself, but on the currently running vcpu of prev->processor.
  And a vcpu that is in per_cpu(schedule_data, <CPU>).curr can't also
  be prev in (any) context_saved(), I think.

* in csched_vcpu_acct(): we set the flag on CSCHED_VCPU(current). I
  may be wrong, but I don't immediately see why we use current here,
  rather than curr_on_cpu(cpu). Yet, I think that, similarly to the
  above, current can't be prev. Still, I may send a "Just in
  case"^TM patch... :-P

* in vcpu_force_reschedule(): it's used in shim code (well... :-) and
  in VCPUOP_set_periodic_timer(). But it only sets the flag if
  prev->is_running is 1, which it is not. Besides, don't most guests
  use the singleshot timer only, these days?

* in cpu_disable_scheduler(): no. Just no.

* in vcpu_set_affinity(): well, it looks to me that either a) we use
  the setting of the bit here to actually enter the if() in
  context_saved(), which is a precondition for the race, and then we
  are already past that, or b) things just work. Will think about it
  more (see the snippet below)...

* in vcpu_pin_override(): again, no... I think?
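To make the vcpu_set_affinity() case easier to stare at, this is
roughly its shape (again from memory of a 4.11-era tree, trimmed to
the relevant path, so details like the affinity_broken handling are
omitted and may differ):

static int vcpu_set_affinity(
    struct vcpu *v, const cpumask_t *affinity, cpumask_t *which)
{
    spinlock_t *lock;

    lock = vcpu_schedule_lock_irq(v);

    cpumask_copy(which, affinity);

    /* Always ask the scheduler to re-evaluate placement. */
    set_bit(_VPF_migrating, &v->pause_flags);

    vcpu_schedule_unlock_irq(lock, v);

    domain_update_node_affinity(v->domain);

    if ( v->pause_flags & VPF_migrating )
    {
        vcpu_sleep_nosync(v);
        vcpu_migrate(v);
    }

    return 0;
}

Note that the set_bit() happens under the scheduler lock, while the
vcpu_sleep_nosync() + vcpu_migrate() pair runs only after the lock
has been dropped.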
So, thoughts? :-)

Dario

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/