[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] RE: [PATCH] mem_sharing: fix race condition of nominate and unshare


  • To: MaoXiaoyun <tinnycloud@xxxxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
  • Date: Wed, 19 Jan 2011 09:11:37 +0000
  • Cc: xen devel <xen-devel@xxxxxxxxxxxxxxxxxxx>, tim.deegan@xxxxxxxxxx, juihaochiang@xxxxxxxxx
  • Delivery-date: Wed, 19 Jan 2011 01:12:37 -0800
  • Domainkey-signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:sender:in-reply-to:references:date :x-google-sender-auth:message-id:subject:from:to:cc:content-type :content-transfer-encoding; b=uSaSPfjWoLn9ES8+9yibGd+RfiCIqT3Tyqg7sVX8TTG71ZFdWsSAKa1A5fPlswjV+L jOpSN9kibaiI3KoH9kbWiEiepeWZYx563icfBs6Tv+ZQ1y+dG6RfDHaJb+4qyh9PRLLf NNd7hrFzZ9PBQh9y8m6TnK3KBT+NWWdoWaczc=
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Very likely.  If you look in xen/arch/x86/mm/p2m.c, the two functions
which check a page to see if it can be reclaimed are
"p2m_pod_zero_check*()".  A little ways into each function there's a
giant "if()" which has all of the conditions for reclaiming a page,
starting with p2m_is_ram().  The easiest way to fix it is to add
p2m_is_shared() to that "if" statement.

 -George

2011/1/19 MaoXiaoyun <tinnycloud@xxxxxxxxxxx>:
> Hi George:
>
>        I am working on the xen mem_sharing,  I think the bug below is
> related to POD.
> (Test shows when POD is enable, it is easily hit the bug, when disabled, no
> bug occurs).
>
> As I know when domU starts will POD, it gets memory from POD cache, and in
> some
> situation, POD cached will scan from Zero pages for reusing(link the page
> into POD
> cache page list), and from the page_info define, list and handle share same
> posistion,
> I think when reusing the page, POD doest't check page type, and if it is a
> shared page
> , it still can be put into POD cache, and thus handle is been overwritten.
>
>       So maybe we need to check the page type before putting into cache,
> What's your opinion?
>       thanks.
>
>>--------------------------------------------------------------------------------
>>From: tinnycloud@xxxxxxxxxxx
>>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>CC: tim.deegan@xxxxxxxxxx
>>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and
>> unshare
>>Date: Tue, 18 Jan 2011 20:05:16 +0800
>>
>>Hi:
>>
>> It is later found that caused by below patch code and I am using the
>> blktap2.
>>The handle retruned from here will later become ch in
>> mem_sharing_share_pages, and then
>>in mem_sharing_share_pages will have ch = sh, thus caused the problem.
>>
>>+    /* Return the handle if the page is already shared */
>>+    page = mfn_to_page(mfn);
>>+    if (p2m_is_shared(p2mt)) {
>>+        *phandle = page->shr_handle;
>>+        ret = 0;
>>+        goto out;
>>+    }
>>+
>>
>>But. after I  removed the code, tests still failed, and this handle's value
>> is not make sence.
>>
>>
>>(XEN) ===>total handles 17834 total gfns 255853
>>(XEN) handle 13856642536914634
>>(XEN) Debug for domain=1, gfn=19fed, Debug page: MFN=349c0a is
>> ci=8000000000000008, ti=8400000000000007, owner_id=32755
>>(XEN) ----[ Xen-4.0.0  x86_64  debug=n  Not tainted ]----
>>(XEN) CPU:    15
>>(XEN) RIP:    e008:[<ffff82c4801bff4b>]
>> mem_sharing_unshare_page+0x19b/0x720
>>(XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor
>>(XEN) rax: 0000000000000000   rbx: ffff83063fc67f28&nbsp ;  rcx:
>> 0000000000000092
>>(XEN) rdx: 000000000000000a   rsi: 000000000000000a   rdi: ffff82c48021e9c4
>>(XEN) rbp: ffff830440000000   rsp: ffff83063fc67c48   r8:  0000000000000001
>>(XEN) r9:  0000000000000000   r10: 00000000fffffff8   r11: 0000000000000005
>>(XEN) r12: 0000000000019fed   r13: 0000000000000000   r14: 0000000000000000
>>(XEN) r15: ffff82f606938140   cr0: 000000008005003b   cr4: 00000000000026f0
>>(XEN) cr3: 000000055513c000   cr2: 0000000000000018
>>(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
>>(XEN) Xen stack trace from rsp=ffff83063fc67c48:
>>(XEN)    02c5f6c8b70fed66 39ef64058b487674 ffff82c4801a6082
>> 0000000000000000
>>(XEN)    00313a8b00313eca 0000000000000001 0000000000000009 ff
>> ff830440000000
>>(XEN)    ffff83063fc67cb8 ffff82c4801df6f9 0000000000000040
>> ffff83063fc67d04
>>(XEN)    0000000000019fed 0000000d000001ed ffff83055458d000
>> ffff83063fc67f28
>>(XEN)    0000000000019fed 0000000000349c0a 0000000000000030
>> ffff83063fc67f28
>>(XEN)    0000000000000030 ffff82c48019baa6 ffff82c4802519c0
>> 0000000d8016838e
>>(XEN)    0000000000000000 00000000000001aa ffff8300bf554000
>> ffff82c4801b3864
>>(XEN)    ffff830440000348 ffff8300bf554000 ffff8300bf5557f0
>> ffff8300bf5557e8
>>(XEN)    00000032027b81f2 ffff82c48026f080 ffff82c4801a9337
>> ffff8300bf448000
>>(XEN)    ffff8300bf554000 ffff830000000000 0000000019fed000
>> ffff8300bf2f2000
>>(XEN)    ffff82c48019985d 0000000000000080 ffff8300bf554000
>> 0000000000019fed
>>(XEN)    ffff82c4801b08ba 000000000001e000 ffff82c48014931f ff
>> ff8305570c6d50
>>(XEN)    ffff82c480251080 00000032027b81f2 ffff8305570c6d50
>> ffff83052f3e2200
>>(XEN)    0000000f027b7de0 ffff82c48011e07a 000000000000000f
>> ffff82c48026f0a0
>>(XEN)    0000000000000082 0000000000000000 0000000000000000
>> 0000000000009e44
>>(XEN)    ffff8300bf554000 ffff8300bf2f2000 ffff82c48011e07a
>> 000000000000000f
>>(XEN)    ffff8300bf555760 0000000000000292 ffff82c48011afca
>> 00000032028a8fc0
>>(XEN)    0000000000000292 ffff82c4801a93c3 00000000000000ef
>> ffff8300bf554000
>>(XEN)    ffff8300bf554000 ffff8300bf5557e8 ffff82c4801a6082
>> ffff8300bf554000
>>(XEN)    0000000000000000 ffff82c4801a0cc8 ffff8300bf554000
>> ffff8300bf554000
>>(XEN) Xen call trace:
>>(XEN)    [<ffff82c4801bff4b>] mem_sharing_unshare_page+0x19b/0x720
>>(XEN)    [<ffff82c4801a6082>] v lapic_has_pending_irq+0x42/0x70
>>(XEN)    [<ffff82c4801df6f9>] ept_get_entry+0xa9/0x1c0
>>(XEN)    [<ffff82c48019baa6>] hvm_hap_nested_page_fault+0xd6/0x190
>>(XEN)    [<ffff82c4801b3864>] vmx_vmexit_handler+0x304/0x1a90
>>(XEN)    [<ffff82c4801a9337>] pt_restore_timer+0x57/0xb0
>>(XEN)    [<ffff82c48019985d>] hvm_do_resume+0x1d/0x130
>>(XEN)    [<ffff82c4801b08ba>] vmx_do_resume+0x11a/0x1c0
>>(XEN)    [<ffff82c48014931f>] context_switch+0x76f/0xf00
>>(XEN)    [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>>(XEN)    [<ffff82c48011e07a>] add_entry+0x3a/0xb0
>>(XEN)    [<ffff82c48011afca>] schedule+0x1ea/0x500
>>(XEN)    [<ffff82c4801a93c3>] pt_update_irq+0x33/0x1e0
>>(XEN)    [&lt ;ffff82c4801a6082>] vlapic_has_pending_irq+0x42/0x70
>>(XEN)    [<ffff82c4801a0cc8>] hvm_vcpu_has_pending_irq+0x88/0xa0
>>(XEN)    [<ffff82c4801b267b>] vmx_vmenter_helper+0x5b/0x150
>>(XEN)    [<ffff82c4801adaa3>] vmx_asm_do_vmentry+0x0/0xdd
>>(XEN)
>>(XEN) Pagetable walk from 0000000000000018:
>>(XEN)  L4[0x000] = 0000000000000000 ffffffffffffffff
>>(XEN)
>>(XEN) ****************************************
>>(XEN) Panic on CPU 15:
>>(XEN) FATAL PAGE FAULT
>>(XEN) [error_code=0000]
>>(XEN) Faulting linear address: 0000000000000018
>>(XEN) ****************************************
>>(XEN)
>>(XEN) Manual reset required ('noreboot' specified)
>>
>>
>>
>>
>>
>> ---------------------------------------------------------------------------------------------------
>>>From: tinnycloud@xxxxxxxxxxx
>>>To: juihaochiang@xxxxxxxxx; xen-devel@xxxxxxxxxxxxxxxxxxx
>>>CC: tim.deegan@xxxxxxxxxx
>>>Subject: RE: [PATCH] mem_sharing: fix race condition of nominate and
>>> unshare
>>>Date: Tue, 18 Jan 2011 17:42:32 +0800
>>
>>>Hi Tim & Jui-Hao:
>>
>> >     When I use Linux HVM instead of Windows HVM, more bug shows up.
>>
>>>      I only start on VM, and when I destroy it , xen crashed on
>>> mem_sharing_unshare_page()
>>>which in line709, hash_entry is NULL. Later I found the handle has been
>>> removed in
>>>mem_sharing_share_pages(), please refer logs below.
>>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel
>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.