Re: [Xen-devel] crash in csched_load_balance after xl vcpu-pin
On Thu, 2018-04-12 at 17:38 +0200, Dario Faggioli wrote:
> On Thu, 2018-04-12 at 15:15 +0200, Dario Faggioli wrote:
> > On Thu, 2018-04-12 at 14:45 +0200, Olaf Hering wrote:
> > >
> > > dies after the first iteration.
> > >
> > > BUG_ON(!test_bit(_VPF_migrating, &prev->pause_flags));
> > >
> > Update. I replaced this:
>
> Olaf, new patch! :-)

FTR, a previous version of this (where I was not printing
smp_processor_id() and prev->is_running), produced the output that I am
attaching below.

Looks to me like, while on the crashing CPU, we are here [*]:

void context_saved(struct vcpu *prev)
{
    ...
    if ( unlikely(prev->pause_flags & VPF_migrating) )
    {
        unsigned long flags;
        spinlock_t *lock = vcpu_schedule_lock_irqsave(prev, &flags);

        if ( vcpu_runnable(prev) ||
             !test_bit(_VPF_migrating, &prev->pause_flags) )
            printk("CPU %u: d%uv%d isr=%u runnbl=%d proc=%d pf=%lu orq=%d csf=%u\n",
                   smp_processor_id(), prev->domain->domain_id,
                   prev->vcpu_id, prev->is_running, vcpu_runnable(prev),
                   prev->processor, prev->pause_flags,
                   SCHED_OP(vcpu_scheduler(prev), onrunq, prev),
                   SCHED_OP(vcpu_scheduler(prev), csflags, prev));

[*]     if ( prev->runstate.state == RUNSTATE_runnable )
            vcpu_runstate_change(prev, RUNSTATE_offline, NOW());

        BUG_ON(curr_on_cpu(prev->processor) == prev);
        SCHED_OP(vcpu_scheduler(prev), sleep, prev);

        vcpu_schedule_unlock_irqrestore(lock, flags, prev);

        vcpu_migrate(prev);
    }
}

On the "other CPU", we might be around here [**]:

static void vcpu_migrate(struct vcpu *v)
{
    ...
    if ( v->is_running ||
         !test_and_clear_bit(_VPF_migrating, &v->pause_flags) )
    {
        sched_spin_unlock_double(old_lock, new_lock, flags);
        return;
    }

    vcpu_move_locked(v, new_cpu);

    sched_spin_unlock_double(old_lock, new_lock, flags);

[**] if ( old_cpu != new_cpu )
         sched_move_irqs(v);

    /* Wake on new CPU.
     */
    vcpu_wake(v);
}

(XEN) d10v1 runnbl=0 proc=22 pf=1 orq=0 csf=4
(XEN) d10v0 runnbl=1 proc=20 pf=0 orq=0 csf=4
(XEN) d10v0 runnbl=1 proc=25 pf=0 orq=0 csf=4
(XEN) d10v2 runnbl=1 proc=31 pf=0 orq=0 csf=4
(XEN) d10v2 runnbl=1 proc=10 pf=0 orq=1 csf=0
(XEN) d10v0 runnbl=1 proc=30 pf=0 orq=0 csf=4
(XEN) d10v0 runnbl=1 proc=15 pf=0 orq=0 csf=4
(XEN) d10v3 runnbl=1 proc=13 pf=0 orq=1 csf=0
(XEN) d10v2 runnbl=1 proc=39 pf=0 orq=0 csf=4
(XEN) d10v3 runnbl=1 proc=32 pf=0 orq=0 csf=4
(XEN) d10v2 runnbl=1 proc=20 pf=0 orq=0 csf=4
(XEN) d10v2 runnbl=1 proc=20 pf=0 orq=0 csf=4
(XEN) d10v1 runnbl=0 proc=26 pf=1 orq=0 csf=4
(XEN) d10v3 runnbl=1 proc=16 pf=0 orq=0 csf=4
(XEN) Xen BUG at sched_credit.c:877
(XEN) ----[ Xen-4.11.20180411T100655.82540b66ce-180412155659  x86_64  debug=y   Not tainted ]----
(XEN) CPU:    16
(XEN) RIP:    e008:[<ffff82d08022c84d>] sched_credit.c#csched_vcpu_migrate+0x52/0x54
(XEN) RFLAGS: 0000000000010006   CONTEXT: hypervisor (d6v0)
(XEN) rax: ffff8300779c9000   rbx: 0000000000000012   rcx: ffff830adac719f0
(XEN) rdx: 0000000000000012   rsi: ffff8300779b2000   rdi: 00000033ff8bb000
(XEN) rbp: ffff83087cfb7ce8   rsp: ffff83087cfb7ce8   r8:  0000000000000010
(XEN) r9:  0000ffff0000ffff   r10: 00ff00ff00ff00ff   r11: 0f0f0f0f0f0f0f0f
(XEN) r12: ffff83047fe82188   r13: ffff83047fe70188   r14: ffff82d0805c7180
(XEN) r15: ffff8300779b2000   cr0: 000000008005003b   cr4: 00000000000026e0
(XEN) cr3: 0000000f8404b000   cr2: 00007f18dfeca000
(XEN) fsb: 0000000000000000   gsb: 0000000000000000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
(XEN) Xen code around <ffff82d08022c84d> (sched_credit.c#csched_vcpu_migrate+0x52/0x54):
(XEN)  5d c3 0f 0b 0f 0b 0f 0b <0f> 0b 55 48 89 e5 48 8d 05 26 a9 39 00 48 8b 57
(XEN) Xen stack trace from rsp=ffff83087cfb7ce8:
(XEN)    ffff83087cfb7cf8 ffff82d080239419 ffff83087cfb7d68 ffff82d08023a8d8
(XEN)    ffff82d0805c7160 ffff82d0805c7180 01ff83087cfb7d78 0000001200000010
(XEN)    0000000000000092 0000000000000296 0000000000000003 ffff8300779b2000
(XEN)    ffff83047fe82188 0000000000000292 0000000000000004 ffff82d0805b2520
(XEN)    ffff83087cfb7db8 ffff82d08023c795 ffff83087cfb7d98 ffff8300779b2000
(XEN)    ffff83087cfb7db8 ffff8300779c9000 ffff8300779b2000 ffff830ad6463000
(XEN)    0000000000000010 ffff830adad26000 ffff83087cfb7e08 ffff82d08027a538
(XEN)    ffff83087cfb7dd8 ffff82d0802a8510 ffff83087cfb7e08 ffff8300779b2000
(XEN)    ffff8300779c9000 ffff83047fe82188 0000008405ba3022 0000000000000003
(XEN)    ffff83087cfb7e98 ffff82d0802397a9 ffff8300779b2560 ffff83047fe821a0
(XEN)    0000001000fb7e58 ffff83047fe82180 ffff82d080328ba1 ffff8300779b2000
(XEN)    ffff830adad26000 ffff8300779c9000 0000000001c9c380 ffff82d080302000
(XEN)    ffff8300779b2000 ffff82d08059c480 ffff82d08059bc80 ffffffffffffffff
(XEN)    ffff83087cfb7fff ffff82d0805a3c80 ffff83087cfb7ed8 ffff82d08023d552
(XEN)    ffff82d080328ba1 ffff8300779b2000 ffff8300779c9000 ffff830adad26000
(XEN)    0000000000000010 ffff830ad6463000 ffff83087cfb7ee8 ffff82d08023d5c5
(XEN)    ffff83087cfb7db8 ffff82d080328d6b ffffffff81c00000 ffffffff81c00000
(XEN)    ffffffff81c00000 0000000000000000 0000000000000000 ffffffff81d4c180
(XEN)    0000000000000008 000000470cb96de6 0000000000000001 0000000000000000
(XEN)    ffffffff81020e50 0000000000000000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN)    [<ffff82d08022c84d>] sched_credit.c#csched_vcpu_migrate+0x52/0x54
(XEN)    [<ffff82d080239419>] schedule.c#vcpu_move_locked+0x42/0xcc
(XEN)    [<ffff82d08023a8d8>] schedule.c#vcpu_migrate+0x210/0x23b
(XEN)    [<ffff82d08023c795>] context_saved+0x21e/0x461
(XEN)    [<ffff82d08027a538>] context_switch+0xe9/0xf67
(XEN)    [<ffff82d0802397a9>] schedule.c#schedule+0x306/0x6ab
(XEN)    [<ffff82d08023d552>] softirq.c#__do_softirq+0x71/0x9a
(XEN)    [<ffff82d08023d5c5>] do_softirq+0x13/0x15
(XEN)    [<ffff82d080328d6b>] vmx_asm_do_vmentry+0x2b/0x30

--
<<This happens because I choose it to happen!>> (Raistlin Majere)
-----------------------------------------------------------------
Dario Faggioli, Ph.D, http://about.me/dario.faggioli
Software Engineer @ SUSE https://www.suse.com/

Attachment: context-save-race-debug.patch

Attachment: signature.asc

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel