
Re: [Xen-devel] Hypervisor crash(!) on xl cpupool-numa-split


  • To: Andre Przywara <andre.przywara@xxxxxxx>
  • From: Juergen Gross <juergen.gross@xxxxxxxxxxxxxx>
  • Date: Mon, 31 Jan 2011 08:04:45 +0100
  • Cc: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Keir Fraser <keir@xxxxxxx>
  • Delivery-date: Sun, 30 Jan 2011 23:05:18 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

On 01/28/11 14:14, Andre Przywara wrote:

>> Do I understand correctly?
>> No crash with only dom0_max_vcpus= and no crash with only dom0_mem= ?
> Yes, see my previous mail to George.
>
>> Could you try this patch?
> Ok, the crash dump is as follows:

Hmm, is the new crash reproducible as well?
It does not seem to be directly related to my diagnosis patch...

Currently I have no NUMA machine available. I tried the numa=fake=...
boot parameter, but it seems to fake only the NUMA memory nodes; all CPUs
are still in node 0:

(XEN) 'u' pressed -> dumping numa info (now-0x120:5D5E0203)
(XEN) idx0 -> NODE0 start->0 size->524288
(XEN) phys_to_nid(0000000000001000) -> 0 should be 0
(XEN) idx1 -> NODE1 start->524288 size->524288
(XEN) phys_to_nid(0000000080001000) -> 1 should be 1
(XEN) idx2 -> NODE2 start->1048576 size->524288
(XEN) phys_to_nid(0000000100001000) -> 2 should be 2
(XEN) idx3 -> NODE3 start->1572864 size->1835008
(XEN) phys_to_nid(0000000180001000) -> 3 should be 3
(XEN) CPU0 -> NODE0
(XEN) CPU1 -> NODE0
(XEN) CPU2 -> NODE0
(XEN) CPU3 -> NODE0
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 3003121):
(XEN)     Node 0: 433864
(XEN)     Node 1: 258522
(XEN)     Node 2: 514315
(XEN)     Node 3: 1796420
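
The mismatch in the dump above (four memory nodes, but every CPU in node 0)
can be checked mechanically. The following is only a minimal sketch: it
assumes the line formats seen in this particular dump, a single domain, and
hypothetical names (DUMP, summarize, mem_by_node); it is not part of Xen or
its tools.

```python
import re
from collections import defaultdict

# Excerpt of the 'u' debug-key output quoted above.
DUMP = """\
(XEN) CPU0 -> NODE0
(XEN) CPU1 -> NODE0
(XEN) CPU2 -> NODE0
(XEN) CPU3 -> NODE0
(XEN) Memory location of each domain:
(XEN) Domain 0 (total: 3003121):
(XEN)     Node 0: 433864
(XEN)     Node 1: 258522
(XEN)     Node 2: 514315
(XEN)     Node 3: 1796420
"""

# Patterns are assumptions derived from this log only.
CPU_RE = re.compile(r"^\(XEN\)\s+CPU(\d+)\s+->\s+NODE(\d+)\s*$")
MEM_RE = re.compile(r"^\(XEN\)\s+Node\s+(\d+):\s+(\d+)\s*$")


def summarize(dump):
    """Return (cpus per node, memory count per node) for a one-domain dump."""
    cpus_by_node = defaultdict(list)
    mem_by_node = {}
    for line in dump.splitlines():
        m = CPU_RE.match(line)
        if m:
            cpus_by_node[int(m.group(2))].append(int(m.group(1)))
            continue
        m = MEM_RE.match(line)
        if m:
            # The unit of this count is not stated in the log.
            mem_by_node[int(m.group(1))] = int(m.group(2))
    return dict(cpus_by_node), mem_by_node


cpus_by_node, mem_by_node = summarize(DUMP)

# Nodes that hold memory but have no CPU assigned to them.
memory_only = sorted(set(mem_by_node) - set(cpus_by_node))

print("CPUs per node:", cpus_by_node)
print("Nodes with memory but no CPU:", memory_only)
print("Sum over nodes:", sum(mem_by_node.values()))
```

The per-node counts add up to the "total" printed for Domain 0, so the memory
side of the fake NUMA setup looks consistent; only the CPU-to-node mapping
stays at node 0.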

I suspect a problem with the __cpuinit stuff overwriting some node info.
Andre, could you check this? I hope to reproduce your problem on my machine.

(XEN) Xen BUG at sched_credit.c:384
(XEN) ----[ Xen-4.1.0-rc2-pre x86_64 debug=y Not tainted ]----
(XEN) CPU: 2
(XEN) RIP: e008:[<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) RFLAGS: 0000000000010093 CONTEXT: hypervisor
(XEN) rax: ffff830434322000 rbx: ffff830434418748 rcx: 0000000000000024
(XEN) rdx: ffff82c4802d3ec0 rsi: 0000000000000003 rdi: ffff8304343c9100
(XEN) rbp: ffff83043457fce8 rsp: ffff83043457fca8 r8: 0000000000000001
(XEN) r9: ffff830434418748 r10: ffff82c48021a0a0 r11: 0000000000000286
(XEN) r12: 0000000000000024 r13: ffff83123a3b2b60 r14: ffff830434418730
(XEN) r15: 0000000000000024 cr0: 000000008005003b cr4: 00000000000006f0
(XEN) cr3: 00000008061df000 cr2: ffff8817a21f87a0
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff83043457fca8:
(XEN) ffff83043457fcb8 ffff83123a3b2b60 0000000000000286 0000000000000024
(XEN) ffff830434418820 ffff83123a3b2a70 0000000000000024 ffff82c4802b0880
(XEN) ffff83043457fd58 ffff82c48011fa63 ffff82f60102aa80 0000000000081554
(XEN) ffff8300c7cfa000 0000000000000000 0000400000000000 ffff82c480248e00
(XEN) 0000000000000002 0000000000000024 ffff830434418820 0000000000305000
(XEN) ffff82c4802550e4 ffff82c4802b0880 ffff83043457fd78 ffff82c48010188c
(XEN) ffff83043457fe40 0000000000000024 ffff83043457fdb8 ffff82c480101b94
(XEN) ffff83043457fdb8 ffff82c4801836f2 fffffffe00000286 ffff83043457ff18
(XEN) 0000000002170004 0000000000305000 ffff83043457fef8 ffff82c480125281
(XEN) ffff83043457fdd8 0000000180153c9d 0000000000000000 ffff82c4801068f8
(XEN) 0000000000000296 ffff8300c7e0a1c8 aaaaaaaaaaaaaaaa 0000000000000000
(XEN) ffff88007d1ac170 ffff88007d1ac170 ffff83043457fef8 ffff82c480113d8a
(XEN) ffff83043457fe78 ffff83043457fe88 0000000800000012 0000000600000004
(XEN) 0000000000000000 ffffffff00000024 0000000000000000 00007fac2e0e5a00
(XEN) 0000000002170000 0000000000000000 0000000000000000 ffffffffffffffff
(XEN) 0000000000000000 0000000000000080 000000000000002f 0000000002170004
(XEN) 0000000002172004 0000000002174004 00007fff878f1c80 0000000000000033
(XEN) ffff83043457fed8 ffff8300c7e0a000 00007fff878f1b30 0000000000305000
(XEN) 0000000000000003 0000000000000003 00007cfbcba800c7 ffff82c480207dd8
(XEN) ffffffff8100946a 0000000000000023 0000000000000003 0000000000000003
(XEN) Xen call trace:
(XEN) [<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) [<ffff82c48011fa63>] schedule_cpu_switch+0x75/0x1eb
(XEN) [<ffff82c48010188c>] cpupool_assign_cpu_locked+0x44/0x8b
(XEN) [<ffff82c480101b94>] cpupool_do_sysctl+0x1fb/0x461
(XEN) [<ffff82c480125281>] do_sysctl+0x921/0xa30
(XEN) [<ffff82c480207dd8>] syscall_enter+0xc8/0x122
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 2:
(XEN) Xen BUG at sched_credit.c:384
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...
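
To read the call path that led to the BUG more easily, the "Xen call trace"
lines can be split into symbol, address and offset. This is a rough sketch
only: it assumes the usual "symbol+offset/size" frame notation and uses
hypothetical names (TRACE, parse_frames, call_chain); it does not resolve
anything against a real xen-syms file.

```python
import re

# Call trace lines copied from the crash dump above.
TRACE = """\
(XEN) Xen call trace:
(XEN) [<ffff82c480117fa0>] csched_alloc_pdata+0x146/0x17f
(XEN) [<ffff82c48011fa63>] schedule_cpu_switch+0x75/0x1eb
(XEN) [<ffff82c48010188c>] cpupool_assign_cpu_locked+0x44/0x8b
(XEN) [<ffff82c480101b94>] cpupool_do_sysctl+0x1fb/0x461
(XEN) [<ffff82c480125281>] do_sysctl+0x921/0xa30
(XEN) [<ffff82c480207dd8>] syscall_enter+0xc8/0x122
"""

# Assumed frame format: [<address>] symbol+0xoffset/0xsize
FRAME_RE = re.compile(
    r"^\(XEN\)\s+\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)\s*$"
)


def parse_frames(text):
    """Return a list of (symbol, address, offset, size), innermost first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line)
        if m:
            addr, sym, off, size = m.groups()
            frames.append((sym, int(addr, 16), int(off, 16), int(size, 16)))
    return frames


frames = parse_frames(TRACE)

# Outermost caller first, ending at the function that hit the BUG.
call_chain = " -> ".join(sym for sym, *_ in reversed(frames))
print(call_chain)
```

Read outermost first, the chain goes from the sysctl entry through
cpupool_do_sysctl and cpupool_assign_cpu_locked to schedule_cpu_switch, and
ends in csched_alloc_pdata, where the BUG at sched_credit.c:384 is reported.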


Juergen

--
Juergen Gross                 Principal Developer Operating Systems
TSP ES&S SWE OS6                       Telephone: +49 (0) 89 3222 2967
Fujitsu Technology Solutions              e-mail: juergen.gross@xxxxxxxxxxxxxx
Domagkstr. 28                           Internet: ts.fujitsu.com
D-80807 Muenchen                 Company details: ts.fujitsu.com/imprint.html
