
[Xen-devel] RE: [PATCH] mem_sharing: fix race condition of nominate and unshare



I'll do the check. Thanks.
Meanwhile, during testing I still hit two other failures.
 
1) When all domains are destroyed, the handle count in the hash table sometimes does not drop to 0.
I print the handle count; most of the time it is 0 after all domains are destroyed, but occasionally I see:
(XEN) ===>total handles 2 total gfns 2 next_handle: 713269
 
2) set_shared_p2m_entry fails, hitting the BUG_ON at mem_sharing.c:756:
 745     list_for_each_safe(le, te, &ce->gfns)
 746     {
 747         gfn = list_entry(le, struct gfn_info, list);
 748         /* Get the source page and type, this should never fail
 749          * because we are under shr lock, and got non-null se */
 750         BUG_ON(!get_page_and_type(spage, dom_cow, PGT_shared_page));
 751         /* Move the gfn_info from ce list to se list */
 752         list_del(&gfn->list);
 753         d = get_domain_by_id(gfn->domain);
 754 //      mem_sharing_debug_gfn(d, gfn->gfn);
 755         BUG_ON(!d);
 756         BUG_ON(set_shared_p2m_entry(d, gfn->gfn, se->mfn) == 0);
 757         put_domain(d);
 758         list_add(&gfn->list, &se->gfns);
 759         put_page_and_type(cpage);
 760 //      mem_sharing_debug_gfn(d, gfn->gfn);  
 
 
(XEN) printk: 33 messages suppressed.
(XEN) p2m.c:2442:d0 set_mmio_p2m_entry: set_p2m_entry failed! mfn=0023dbb7
(XEN) Xen BUG at mem_sharing.c:756
(XEN) ----[ Xen-4.0.0  x86_64  debug=n  Not tainted ]----
(XEN) CPU:    0
(XEN) RIP:    e008:[<ffff82c4801bfd90>] mem_sharing_share_pages+0x370/0x3d0
(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
(XEN) rax: 0000000000000000   rbx: ffff83040ed20000   rcx: 0000000000000092
(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48021eac4
(XEN) rbp: ffff8305a4bbe1b0   rsp: ffff82c48035fc58   r8:  0000000000000001
(XEN) r9:  0000000000000000   r10: 00000000fffffffb   r11: ffff82c4801318d0
(XEN) r12: ffff8305a4bbe1a0   r13: ffff8305a61d42a0   r14: ffff82f6047b76e0
(XEN) r15: ffff8304e5e918c8   cr0: 0000000080050033   cr4: 00000000000026f0
(XEN) cr3: 00000005203fc000   cr2: 00000000027b8000


 
 
 
> Date: Thu, 20 Jan 2011 09:19:34 +0000
> From: Tim.Deegan@xxxxxxxxxx
> To: tinnycloud@xxxxxxxxxxx
> CC: xen-devel@xxxxxxxxxxxxxxxxxxx; juihaochiang@xxxxxxxxx
> Subject: Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
>
> At 07:19 +0000 on 20 Jan (1295507976), MaoXiaoyun wrote:
> > Hi:
> >
> > The latest BUG is in mem_sharing_alloc_page, called from mem_sharing_unshare_page.
> > I printed the heap info, which shows plenty of memory left.
> > Could the domain be NULL during unshare, or should it be locked via rcu_lock_domain_by_id()?
> >
>
> 'd' probably isn't NULL; more likely is that the domain is not allowed
> to have any more memory. You should look at the values of d->max_pages
> and d->tot_pages when the failure happens.
>
> Cheers.
>
> Tim.
>
> > -----------code------------
> > 422 extern void pagealloc_info(unsigned char key);
> > 423 static struct page_info* mem_sharing_alloc_page(struct domain *d,
> > 424                                                 unsigned long gfn,
> > 425                                                 int must_succeed)
> > 426 {
> > 427     struct page_info* page;
> > 428     struct vcpu *v = current;
> > 429     mem_event_request_t req;
> > 430
> > 431     page = alloc_domheap_page(d, 0);
> > 432     if(page != NULL) return page;
> > 433
> > 434     memset(&req, 0, sizeof(req));
> > 435     if(must_succeed)
> > 436     {
> > 437         /* We do not support 'must_succeed' any more. External operations such
> > 438          * as grant table mappings may fail with OOM condition!
> > 439          */
> > 440         pagealloc_info('m');
> > 441         BUG();
> > 442     }
> >
> > -------------serial output-------
> > (XEN) Physical memory information:
> > (XEN) Xen heap: 0kB free
> > (XEN) heap[14]: 64480kB free
> > (XEN) heap[15]: 131072kB free
> > (XEN) heap[16]: 262144kB free
> > (XEN) heap[17]: 524288kB free
> > (XEN) heap[18]: 1048576kB free
> > (XEN) heap[19]: 1037128kB free
> > (XEN) heap[20]: 3035744kB free
> > (XEN) heap[21]: 2610292kB free
> > (XEN) heap[22]: 2866212kB free
> > (XEN) Dom heap: 11579936kB free
> > (XEN) Xen BUG at mem_sharing.c:441
> > (XEN) ----[ Xen-4.0.0 x86_64 debug=n Not tainted ]----
> > (XEN) CPU: 0
> > (XEN) RIP: e008:[<ffff82c4801c0531>] mem_sharing_unshare_page+0x681/0x790
> > (XEN) RFLAGS: 0000000000010282 CONTEXT: hypervisor
> > (XEN) rax: 0000000000000000 rbx: ffff83040092d808 rcx: 0000000000000096
> > (XEN) rdx: 000000000000000a rsi: 000000000000000a rdi: ffff82c48021eac4
> > (XEN) rbp: 0000000000000000 rsp: ffff82c48035f5e8 r8: 0000000000000001
> > (XEN) r9: 0000000000000001 r10: 00000000fffffff5 r11: 0000000000000008
> > (XEN) r12: ffff8305c61f3980 r13: ffff83040eff0000 r14: 000000000001610f
> > (XEN) r15: ffff82c48035f628 cr0: 000000008005003b cr4: 00000000000026f0
> > (XEN) cr3: 000000052bc4f000 cr2: ffff880120126e88
> > (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
> > (XEN) Xen stack trace from rsp=ffff82c48035f5e8:
> > (XEN) ffff8305c61f3990 00018300bf2f0000 ffff82f604e6a4a0 000000002ab84078
> > (XEN) ffff83040092d7f0 00000000001b9c9c ffff8300bf2f0000 000000010eff0000
> > (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> > (XEN) 0000000000000000 0000000d0000010f ffff8305447ec000 000000000001610f
> > (XEN) 0000000000273525 ffff82c48035f724 ffff830502c705a0 ffff82f602c89a00
> > (XEN) ffff83040eff0000 ffff82c48010bfa9 ffff830572c5dbf0 000000000029e07f
> > (XEN) 0000000000000000 ffff830572c5dbf0 000000008035fbe8 ffff82c48035f6f8
> > (XEN) 0000000100000002 ffff830572c5dbf0 ffff83063fc30000 ffff830572c5dbf0
> > (XEN) 0000035900000000 ffff88010d14bbe0 ffff880159e09000 00003f7e00000002
> > (XEN) ffffffffffff0032 ffff88010d14bbb0 ffff830438dfa920 0000000d8010a650
> > (XEN) 0000000000000100 ffff83063fc30000 ffff8305f9203730 ffffffffffffffea
> > (XEN) ffff88010d14bb70 0000000000000000 ffff88010d14bc10 ffff88010d14bbc0
> > (XEN) 0000000000000002 ffff82c48010da9b 0000000000000202 ffff82c48035fec8
> > (XEN) ffff82c48035f7c8 00000000801880af ffff83063fc30010 0000000000000000
> > (XEN) ffff82c400000008 ffff82c48035ff28 0000000000000000 ffff88010d14bbc0
> > (XEN) ffff880159e08000 0000000000000000 0000000000000000 00020000000002d7
> > (XEN) 00000000003f2b38 ffff8305b1f4b6b8 ffff8305b30f0000 ffff880159e09000
> > (XEN) 0000000000000000 0000000000000000 000200000000008a 00000000003ed1f9
> > (XEN) ffff83063fc26450 ffff8305b30f0000 ffff880159e0a000 0000000000000000
> > (XEN) 0000000000000000 00020000000001fa 000000000029e2ba ffff83063fc26fd0
> > (XEN) Xen call trace:
> > (XEN) [<ffff82c4801c0531>] mem_sharing_unshare_page+0x681/0x790
> > (XEN) [<ffff82c48010bfa9>] gnttab_map_grant_ref+0xbf9/0xe30
> > (XEN) [<ffff82c48010da9b>] do_grant_table_op+0x14b/0x1080
> > (XEN) [<ffff82c48010fb44>] do_xen_version+0xb4/0x480
> > (XEN) [<ffff82c4801b8215>] set_p2m_entry+0x85/0xc0
> > (XEN) [<ffff82c4801bc92e>] set_shared_p2m_entry+0x1be/0x2f0
> > (XEN) [<ffff82c480121c4c>] xmem_pool_free+0x2c/0x310
> > (XEN) [<ffff82c4801bfaf8>] mem_sharing_share_pages+0xd8/0x3d0
> > (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
> > (XEN) [<ffff82c48011c519>] cpumask_raise_softirq+0x89/0xa0
> > (XEN) [<ffff82c480118351>] csched_vcpu_wake+0x101/0x1b0
> > (XEN) [<ffff82c48014717d>] vcpu_kick+0x1d/0x80
> > (XEN) [<ffff82c4801447da>] __find_next_bit+0x6a/0x70
> > (XEN) [<ffff82c48015a1d8>] get_page+0x28/0xf0
> > (XEN) [<ffff82c48015ed72>] do_update_descriptor+0x1d2/0x210
> > (XEN) [<ffff82c480113d7e>] do_multicall+0x14e/0x340
> > (XEN) [<ffff82c4801e3169>] syscall_enter+0xa9/0xae
> > (XEN)
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Xen BUG at mem_sharing.c:441
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Manual reset required ('noreboot' specified)
> >
> > > Date: Mon, 17 Jan 2011 17:02:02 +0800
> > > Subject: Re: [PATCH] mem_sharing: fix race condition of nominate and unshare
> > > From: juihaochiang@xxxxxxxxx
> > > To: tinnycloud@xxxxxxxxxxx
> > > CC: xen-devel@xxxxxxxxxxxxxxxxxxx; tim.deegan@xxxxxxxxxx
> > >
> > > Hi, tinnycloud:
> > >
> > > Do you have the xenpaging tools running properly?
> > > I haven't gone through that path, but it seems you have run out of memory.
> > > When this happens, mem_sharing requests memory from the
> > > xenpaging daemon, which pages out guest memory to free some up.
> > > Otherwise, the allocation fails.
> > > Is this your scenario?
> > >
> > > Bests,
> > > Jui-Hao
> > >
> > > 2011/1/17 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> > > > Another failure on BUG() in mem_sharing_alloc_page()
> > > >
> > > > memset(&req, 0, sizeof(req));
> > > > if(must_succeed)
> > > > {
> > > >     /* We do not support 'must_succeed' any more. External operations
> > > >      * such as grant table mappings may fail with OOM condition! */
> > > >     BUG();   /* ===================> bug here */
> > > > }
> > > > else
> > > > {
> > > >     /* All foreign attempts to unshare pages should be handled through
> > > >      * 'must_succeed' case. */
> > > >     ASSERT(v->domain->domain_id == d->domain_id);
> > > >     vcpu_pause_nosync(v);
> > > >     req.flags |= MEM_EVENT_FLAG_VCPU_PAUSED;
> > > > }
> > > >
>
> --
> Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> Principal Software Engineer, Xen Platform Team
> Citrix Systems UK Ltd. (Company #02937203, SL9 0BG)
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

