Re: [Xen-devel] Xen 4.3 + tmem = Xen BUG at domain_page.c:143
On Tue, Jun 11, 2013 at 4:30 PM, konrad wilk <konrad.wilk@xxxxxxxxxx> wrote:
> I think this is a more subtle bug.
> I applied a debug patch (see attached) and with the help of it and the logs:
>
> (XEN) domain_page.c:160:d1 mfn (1ebe96) -> 6 idx: 32(i:1,j:0), branch:1
> (XEN) domain_page.c:166:d1 [0] idx=26, mfn=0x1ebcd8, refcnt: 0
> (XEN) domain_page.c:166:d1 [1] idx=12, mfn=0x1ebcd9, refcnt: 0
> (XEN) domain_page.c:166:d1 [2] idx=2, mfn=0x210e9a, refcnt: 0
> (XEN) domain_page.c:166:d1 [3] idx=14, mfn=0x210e9b, refcnt: 0
> (XEN) domain_page.c:166:d1 [4] idx=7, mfn=0x210e9c, refcnt: 0
> (XEN) domain_page.c:166:d1 [5] idx=10, mfn=0x210e9d, refcnt: 0
> (XEN) domain_page.c:166:d1 [6] idx=5, mfn=0x210e9e, refcnt: 0
> (XEN) domain_page.c:166:d1 [7] idx=13, mfn=0x1ebe97, refcnt: 0
> (XEN) Xen BUG at domain_page.c:169
>
> (XEN) ----[ Xen-4.3-unstable x86_64 debug=y Not tainted ]----
> (XEN) CPU: 3
> (XEN) RIP: e008:[<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
>
> (XEN) RFLAGS: 0000000000010046 CONTEXT: hypervisor
> (XEN) rax: 0000000000000000 rbx: ffff8300c68f9000 rcx: 0000000000000000
> (XEN) rdx: ffff8302125b2020 rsi: 000000000000000a rdi: ffff82c4c027a6e8
> (XEN) rbp: ffff8302125afcc8 rsp: ffff8302125afc48 r8: 0000000000000004
> (XEN) r9: 0000000000000004 r10: 0000000000000004 r11: 0000000000000001
> (XEN) r12: ffff83022e2ef000 r13: 00000000001ebe96 r14: 0000000000000020
> (XEN) r15: ffff8300c68f9080 cr0: 0000000080050033 cr4: 00000000000426f0
> (XEN) cr3: 0000000209541000 cr2: ffffffffff600400
>
> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> (XEN) Xen stack trace from rsp=ffff8302125afc48:
> (XEN) 00000000001ebe97 0000000000000000 0000000000000000 ffff830200000001
> (XEN) ffff8302125afcc8 ffff82c400000000 00000000001ebe97 000000080000000d
> (XEN) ffff83022e2ef2d8 0000000000000286 ffff82c4c0127b6b ffff83022e2ef000
> (XEN) ffff82e003d7d2c0 ffff8302125afd60 00000000001ebe96 0000000000000000
> (XEN) ffff8302125afd38 ffff82c4c01373de 0000000000000000 ffffffffffffffff
> (XEN) 0000000000000001 ffff8302125afd58 ffff83022e2ef2d8 0000000000000286
>
> (XEN) 0000000000000027 0000000000000000 0000000000001000 0000000000000000
> (XEN) 0000000000000000 00000000001ebe96 ffff8302125afd98 ffff82c4c01377c4
> (XEN) 0000000000000000 ffff820040017000 ffff82e003d7d2c0 00000000001ebe96
> (XEN) ffff8302125afd98 ffff830210ecf390 00000000fffffff4 ffff820040009010
> (XEN) ffff820040000f50 ffff83022e2f0c90 ffff8302125afe18 ffff82c4c0135929
> (XEN) 000000160000001e ffff820040000f50 0000000000000000 00000000001ebe96
> (XEN) 0000000000000000 0000000000000000 0000a2f6125afe28 ffff8302125afe00
> (XEN) 0000001675f02b51 ffff83022e2f0c90 ffff830210ecf390 0000000000000000
> (XEN) 0000000000000001 0000000000000065 ffff8302125afef8 ffff82c4c0136510
> (XEN) ffff830200001000 0000000000000000 ffff8302125afe90 255ece02125b2040
> (XEN) 00000003125afe68 00000016742667d1 ffff8302125b2100 0000003d52299000
> (XEN) ffff8300c68f9000 0000000001c9c380 ffff8302125b2100 ffff8302125b1808
> (XEN) 0000000000000004 0000000000000004 0000000000000000 0000000000000000
> (XEN) 000000000000a2f6 0000000000000000 00000000001ebe96 ffff82c4c0126e77
> (XEN) Xen call trace:
> (XEN) [<ffff82c4c01606a7>] map_domain_page+0x61d/0x6e1
>
> (XEN) [<ffff82c4c01373de>] cli_get_page+0x15e/0x17b
> (XEN) [<ffff82c4c01377c4>] tmh_copy_from_client+0x150/0x284
> (XEN) [<ffff82c4c0135929>] do_tmem_put+0x323/0x5c4
> (XEN) [<ffff82c4c0136510>] do_tmem_op+0x5a0/0xbd0
> (XEN) [<ffff82c4c022391b>] syscall_enter+0xeb/0x145
>
> (XEN)
> (XEN)
> (XEN) ****************************************
> (XEN) Panic on CPU 3:
> (XEN) Xen BUG at domain_page.c:169
>
> (XEN) ****************************************
> (XEN)
> (XEN) Manual reset required ('noreboot' specified)
>
> It looks as if the path that is taken is:
>
> 110     idx = find_next_zero_bit(dcache->inuse, dcache->entries,
>                                  dcache->cursor);
> 111     if ( unlikely(idx >= dcache->entries) )
> 112     {
>
> 115         /* /First/, clean the garbage map and update the inuse list. */
> 116         for ( i = 0; i < BITS_TO_LONGS(dcache->entries); i++ )
> 117         {
> 118             dcache->inuse[i] &= ~xchg(&dcache->garbage[i], 0);
> 119             accum |= ~dcache->inuse[i];
>
> Here computes the accum
> 120         }
> 121
> 122         if ( accum )
> 123             idx = find_first_zero_bit(dcache->inuse, dcache->entries)
>
> Ok, finds the idx (32),
> 124         else
> 125         {
> .. does not go here.
> 142         }
> 143         BUG_ON(idx >= dcache->entries);
>
> And hits the BUG_ON().
>
> But I am not sure if that is appropriate. Perhaps the BUG_ON was meant as a
> check for the loop (lines 128 -> 141) - in case it looped around and never
> found an empty place.
> But if that is the condition then that would also look suspect as it might
> have found an empty hash entry and the idx would still end up being 32.

Right -- it is really curious that "accum |= ~dcache->inuse[x]" managed
to be non-zero, while find_first_zero_bit() goes off the end (as it
seems).

It seems like you should add a printk in the first loop:

    if(~dcache->inuse[i]) printk(...);

Also, I don't think you've printed what dcache->entries is -- is it 32?

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
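
To make the question about dcache->entries concrete, here is a minimal standalone sketch of one way the two checks could disagree. It is not Xen code and it does not claim to be the confirmed cause. It assumes entries is 32 and unsigned long is 64 bits wide, so BITS_TO_LONGS(32) is 1 and the single inuse word carries 32 bits that lie beyond entries. first_zero_bit(), accum_whole_words() and accum_in_range() are hypothetical stand-ins written for this illustration; only the "accum |= ~inuse[i]" step mirrors the quoted loop.

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define BITS_TO_LONGS(n) (((n) + BITS_PER_LONG - 1) / BITS_PER_LONG)

/*
 * Simplified stand-in for find_first_zero_bit(): index of the first
 * clear bit below 'size', or 'size' when every bit in range is set.
 */
static unsigned int first_zero_bit(const unsigned long *map, unsigned int size)
{
    for (unsigned int i = 0; i < size; i++)
        if (!(map[i / BITS_PER_LONG] & (1UL << (i % BITS_PER_LONG))))
            return i;
    return size;
}

/*
 * accum as in the quoted loop: ORs the complement of whole words,
 * so clear bits at positions >= entries in the last word also count.
 */
static unsigned long accum_whole_words(const unsigned long *inuse,
                                       unsigned int entries)
{
    unsigned long accum = 0;

    for (unsigned int i = 0; i < BITS_TO_LONGS(entries); i++)
        accum |= ~inuse[i];
    return accum;
}

/*
 * Hypothetical variant: ignore bits at positions >= entries in the
 * last word, so accum is non-zero only when a slot in range is free.
 */
static unsigned long accum_in_range(const unsigned long *inuse,
                                    unsigned int entries)
{
    unsigned long accum = 0;
    unsigned int words = BITS_TO_LONGS(entries);
    unsigned int tail = entries % BITS_PER_LONG;

    assert(entries > 0);
    for (unsigned int i = 0; i < words; i++) {
        unsigned long free_bits = ~inuse[i];

        if (i == words - 1 && tail)
            free_bits &= (1UL << tail) - 1;
        accum |= free_bits;
    }
    return accum;
}
```

With inuse[0] set to 0xffffffff (all 32 slots busy) on an LP64 build, accum_whole_words(inuse, 32) is non-zero because of the upper 32 bits, while first_zero_bit(inuse, 32) returns 32. That would match "accum non-zero, idx == 32" in the log, if entries really is 32. accum_in_range() returns 0 for the same input, which would send the code down the else branch instead.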