Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split
Juergen Gross wrote:
> Andre, Stephan,
>
> could you give the attached patch a try? It moves the cpu
> assigning/unassigning into a tasklet always executed on the cpu to be
> moved. This should avoid critical races.

Done. I checked it twice, but sadly it does not fix the issue. It still BUGs:

(XEN) Xen BUG at sched_credit.c:990
(XEN) ----[ Xen-4.1.0-rc3-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c480118208>] csched_acct+0x11f/0x419
(XEN) RFLAGS: 0000000000010006 CONTEXT: hypervisor
(XEN) rax: 0000000000000010 rbx: 0000000000000f00 rcx: 0000000000000100
(XEN) rdx: 0000000000001000 rsi: ffff830437ffa600 rdi: 0000000000000010
(XEN) rbp: ffff82c480297e10 rsp: ffff82c480297d80 r8: 0000000000000100
(XEN) r9: 0000000000000006 r10: ffff82c4802d4100 r11: 0000017322fea49a
(XEN) r12: ffff830437ffa5e0 r13: ffff82c4801180e9 r14: ffff83043399f018
(XEN) r15: ffff830434321ec0 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 00000000c7c9c000 cr2: 0000000001ec8048
(XEN) ds: 002b es: 002b fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c480297d80:
(XEN) ffff82c480297f18 fffffed4c7cd6000 ffff830000000eff ffff830437ffa5e0
(XEN) ffff830437ffa5e8 ffff82c480297df8 ffff830437ffa5e0 0000000000000282
(XEN) ffff830437ffa5e8 00001c200000000f 00000f0000000f00 0000000000000000
(XEN) ffff82c400000000 ffff82c4802d3f80 ffff830437ffa5e0 ffff82c4801180e9
(XEN) ffff83043399f018 ffff83043399f010 ffff82c480297e40 ffff82c480126044
(XEN) 0000000000000002 ffff830437ffa600 ffff82c4802d3f80 00000173010849b7
(XEN) ffff82c480297e90 ffff82c480126369 ffff82c48024aea0 ffff82c4802d3f80
(XEN) ffff83043399f010 0000000000000000 0000000000000000 ffff82c4802b0880
(XEN) ffff82c480297f18 ffffffffffffffff ffff82c480297ed0 ffff82c480123437
(XEN) ffff8300c7e1e0f8 ffff82c480297f18 ffff82c48024aea0 ffff82c480297f18
(XEN) 0000017301008665 ffff82c4802d3ec0 ffff82c480297ee0 ffff82c4801234b2
(XEN) ffff82c480297f10 ffff82c4801564f5 0000000000000000 ffff8300c7cd6000
(XEN) 0000000000000000 ffff8300c7e1e000 ffff82c480297d48 0000000000000000
(XEN) 0000000000000000 0000000000000000 ffffffff81a69060 ffff8817a8553f10
(XEN) ffff8817a8553fd8 0000000000000246 ffff8817a8553e80 ffff880000000001
(XEN) 0000000000000000 0000000000000000 ffffffff810093aa 000000000000e030
(XEN) 00000000deadbeef 00000000deadbeef 0000010000000000 ffffffff810093aa
(XEN) 000000000000e033 0000000000000246 ffff8817a8553ef8 000000000000e02b
(XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 ffff8300c7cd6000 0000000000000000 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c480118208>] csched_acct+0x11f/0x419
(XEN) [<ffff82c480126044>] execute_timer+0x4e/0x6c
(XEN) [<ffff82c480126369>] timer_softirq_action+0xf2/0x245
(XEN) [<ffff82c480123437>] __do_softirq+0x88/0x99
(XEN) [<ffff82c4801234b2>] do_softirq+0x6a/0x7a
(XEN) [<ffff82c4801564f5>] idle_loop+0x6a/0x6f
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at sched_credit.c:990
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

Stephan had created more printk debug patches, we will summarize the
results soon.

Regards,
Andre.

> Regarding Stephans rant:
> You should be aware that the main critical sections are only in the
> tasklets. The locking in the main routines is needed only to avoid the
> cpupool to be destroyed in between.
>
> I'm not sure whether the master_ticker patch is still needed. It seems
> to break something, as my machine hung up after several 100 cpu moves
> (without the new patch). I'm still investigating this problem.
>
> Juergen

--
Andre Przywara
AMD-Operating System Research Center (OSRC), Dresden, Germany

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
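
A note for anyone reading the call trace in this report: each frame is printed as `[<address>] symbol+0xoffset/0xsize`, so the start address of a function is the frame address minus the offset. The following is a minimal Python sketch of that arithmetic. The helper name `parse_frames` and the regular expression are illustrative assumptions, not part of Xen or its tooling.

```python
import re

# Matches one frame of the form "[<address>] symbol+0xoffset/0xsize".
# This pattern is an assumption based on the frames quoted in the report.
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

# Call trace lines copied from the crash report.
SAMPLE = """\
(XEN) [<ffff82c480118208>] csched_acct+0x11f/0x419
(XEN) [<ffff82c480126044>] execute_timer+0x4e/0x6c
(XEN) [<ffff82c480126369>] timer_softirq_action+0xf2/0x245
(XEN) [<ffff82c480123437>] __do_softirq+0x88/0x99
(XEN) [<ffff82c4801234b2>] do_softirq+0x6a/0x7a
(XEN) [<ffff82c4801564f5>] idle_loop+0x6a/0x6f
"""


def parse_frames(log_text):
    """Return one dict per call-trace frame found in log_text."""
    frames = []
    for match in FRAME_RE.finditer(log_text):
        address, symbol, offset, size = match.groups()
        address = int(address, 16)
        offset = int(offset, 16)
        frames.append({
            "symbol": symbol,
            "address": address,          # address printed for this frame
            "offset": offset,            # bytes into the function
            "size": int(size, 16),       # total size of the function
            "start": address - offset,   # computed function start
        })
    return frames


if __name__ == "__main__":
    top = parse_frames(SAMPLE)[0]
    print(top["symbol"], "starts at", hex(top["start"]))
```

For the faulting frame this yields 0xffff82c4801180e9 as the start of csched_acct, the same value that shows up in r13 and on the stack in the dump.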