
Re: [Xen-devel] [PATCH][Xen 4.0-testing.hg] fix small bugs of memory sharing


  • To: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
  • From: Jui-Hao Chiang <juihaochiang@xxxxxxxxx>
  • Date: Thu, 9 Dec 2010 16:14:43 +0800
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>, Grzegorz Milos <grzegorz.milos@xxxxxxxxx>
  • Delivery-date: Thu, 09 Dec 2010 00:15:55 -0800
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi, Tim:

Thanks for the information.
First, I should explain that our current project is trying to implement memory deduplication for unmodified guests in Xen (basically HVM).
Since the memory sharing code provides a good foundation for a COW mechanism, we would like to test and use it.
Please see my inline comments.

On Wed, Dec 8, 2010 at 6:51 PM, Tim Deegan <Tim.Deegan@xxxxxxxxxx> wrote:
Hi,

At 02:56 +0000 on 03 Dec (1291344990), Jui-Hao Chiang wrote:
> This small patch fixes 2 problems of memory sharing for xen-4.0-testing.hg
> (I haven't submitted patch here, if it violates any conventional
> rules, I'm glad to have advices)

Thanks for your patch!

Patches should be based on the tip of xen-unstable; we apply them there
and backport to the stable branches.

You are right, I should base the patch on xen-unstable when submitting it.
However, in the latest xen-unstable, the mem_sharing_share_pages() function crashes the whole of Xen.
I currently don't have a serial port to capture the oops message. Could someone give me a hint on how to debug this kind of crash?
 

Also, you need to add a "Signed-off-by" line to the patch description to
declare that the code is appropriately owned/licensed.
See: http://elinux.org/Developer_Certificate_Of_Origin for what that means.

Got it.

> (1) When nominating a shared page, the page_make_sharable() does not
>     recover the type_info count if it fails to nominate the page.

It looks to me as if it works already -- the cmpxchg loop in that
function always changes from (type = none, count = 0) to (type = shared,
count = 1), so the put_page_and_type() in the failure case does the
right thing, putting the count back to 0.


It seems the candidate page for nomination usually has (type=none, count=1), and it is fine for page_make_sharable() to take it to (type=none, count=2) afterwards.
However, when the page starts at (type=none, count=2), page_make_sharable() goes wrong in the following steps:
(step1) get_page() increases the count (type=none, count=3)
(step2) the cmpxchg loop changes the type (type=8400000000000001, count=3; actually the real value of count_info is 0x8000000000000002)
(step3) the check finds that the count is greater than 2 and aborts without restoring the type to none

That is why I swapped (step2) and (step3) and replaced put_page_and_type() with put_page().
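
The reordering described in the steps above can be illustrated with a small standalone model. This is only a sketch of the behaviour as described in this message, not the Xen source: struct model_page, TYPE_NONE, TYPE_SHARED and both function names are hypothetical, and the model assumes the failure path drops the reference taken in step 1 (the message only states that the type is not restored).

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for Xen's page_info fields. */
#define TYPE_NONE   0u
#define TYPE_SHARED 1u

struct model_page {
    unsigned int type;   /* stands in for type_info */
    unsigned int count;  /* stands in for the reference count in count_info */
};

/* Ordering from steps 1-3: the type is changed before the count check,
 * and the failure path leaves the type set to shared. */
bool make_sharable_type_first(struct model_page *pg)
{
    pg->count++;                 /* step 1: get_page() */
    pg->type = TYPE_SHARED;      /* step 2: cmpxchg loop changes the type */
    if (pg->count != 2) {        /* step 3: count check fails */
        pg->count--;             /* assumed: reference dropped, type not restored */
        return false;
    }
    return true;
}

/* Proposed ordering: check the count first, so the failure path only
 * needs the equivalent of put_page() and the type is never touched. */
bool make_sharable_count_first(struct model_page *pg)
{
    pg->count++;                 /* get_page() */
    if (pg->count != 2) {
        pg->count--;             /* put_page() only */
        return false;
    }
    pg->type = TYPE_SHARED;      /* type changes only after the check passes */
    return true;
}
```

In this model, a page that starts at (type=none, count=2) is left with type=shared after a failed call to the first function, but is returned unchanged by the second.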

I don't understand why this function requires type == none; CC'ing the
author for an explanation.

> (2) When building xen with debug=n, the code in ASSERT() won't get
>     executed. Change to BUG_ON.

This part is clearly correct; I've made the equivalent change in
xen-unstable as changeset 22467:89116f28083f
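
To illustrate point (2), here is a minimal standalone sketch of why work placed inside an assertion disappears in a non-debug build. MODEL_ASSERT, MODEL_BUG_ON, MODEL_DEBUG and take_ref() are hypothetical stand-ins that mimic the behaviour under discussion; they are not Xen's macros.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; MODEL_DEBUG plays the role of debug=y. */
#ifdef MODEL_DEBUG
#define MODEL_ASSERT(cond) do { if (!(cond)) abort(); } while (0)
#else
#define MODEL_ASSERT(cond) do { } while (0)  /* condition is never evaluated */
#endif
#define MODEL_BUG_ON(cond) do { if (cond) abort(); } while (0)

int refs_taken = 0;

/* A call with a side effect, like taking a page reference. */
int take_ref(void)
{
    refs_taken++;
    return 1;
}

/* With MODEL_DEBUG undefined, take_ref() is never called here. */
void with_assert(void)
{
    MODEL_ASSERT(take_ref());
}

/* The condition is always evaluated, so the reference is always taken. */
void with_bug_on(void)
{
    MODEL_BUG_ON(!take_ref());
}
```

Built without MODEL_DEBUG, with_assert() leaves refs_taken unchanged while with_bug_on() increments it, which is the difference the patch addresses.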

> Besides, I don't understand why the page_make_sharable() force checking the count_info with the following way?
> /* Check if the ref count is 2. The first from PGT_allocated, and the second
>      * from get_page at the top of this function */
>     if(page->count_info != (PGC_allocated | (2 + expected_refcnt)))
>
> This seems to imply that the following kind of page can never be nominated for shared pages because ci (count_info) is greater than 2 after get_page. Here, domain 3 is a 64-bit HVM with hap=1, pae=1 on 64bit Xen.
> (XEN) Debug for domain=3, gfn=10, Debug page: MFN=c210ad is ci=8000000000000002, ti=0, owner_id=3
>
> Can someone gives a hint that
> (1) in what kind of scenario that ci = 2 and ti=0?
> (2) or why not allow ci >=2 to be nominated?

count = 2 and type = 0 happens in exactly the situation that the comment
describes: the page has no mappings from anywhere, just the one refcount
from being allocated and one taken at the start of the current function.


(Please correct me if I am wrong!)
From my observation, a normal page mapped by a single gfn of a domain has count=1 (from PGC_allocated) and type=0.
page_make_sharable() calls get_page() before checking ci against 2.
So the check passes for a page that holds only the PGC_allocated reference, but fails if ci=2 before get_page().
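
The effect of the quoted check on the two cases can be sketched as follows. This is a hedged model, not Xen code: MODEL_PGC_ALLOCATED is assumed to be the top bit only because the debug value ci=8000000000000002 suggests it, MODEL_COUNT_MASK is a hypothetical mask that ignores any other flag bits, and both function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the allocated flag is bit 63, as ci=8000000000000002 suggests. */
#define MODEL_PGC_ALLOCATED (UINT64_C(1) << 63)
/* Hypothetical mask: treats every lower bit as part of the reference count. */
#define MODEL_COUNT_MASK    (MODEL_PGC_ALLOCATED - 1)

/* Extract the reference count from a count_info value in this model. */
uint64_t model_refcount(uint64_t ci)
{
    return ci & MODEL_COUNT_MASK;
}

/* Mirror the quoted check: get_page() adds one reference, then the result
 * must equal PGC_allocated | (2 + expected_refcnt) exactly. */
bool passes_nomination_check(uint64_t ci_before, uint64_t expected_refcnt)
{
    uint64_t ci_after_get_page = ci_before + 1;
    return ci_after_get_page == (MODEL_PGC_ALLOCATED | (2 + expected_refcnt));
}
```

Under these assumptions, a page with ci=0x8000000000000001 before get_page() passes with expected_refcnt=0, while the page from the debug line (ci=0x8000000000000002) does not.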

After some tracing, I found a scenario that produces such a page (ci=2, ti=0).
It seems the stub domain for an HVM guest maps some of the HVM's memory into its own address space using do_mmu_update(), which increases ci from 1 to 2 without changing the type or marking the page as shared. For a 1GB 64-bit HVM CentOS 5.5 guest (pae=1, hap=1), around 250MB of pages end up with ci=2 after booting to a user-space prompt.

I have two questions:
(1) Does the stub domain do this to perform I/O for the HVM guest? Can someone point out where this code is?
(2) Is there a way, or a place in the code, to unmap the memory and bring the page count back to 1?

It's not possible to share a page with typecount > 0 because we need to
change its type.  I'm not sure why the refcount can't be greater than
two though, but I think it's to do with how shared pages have their
refcounts tracked differently to other pages.  Again, maybe Grzegorz can
clarify.

Assume my guess about the stub domain above is right.
If a page from that scenario is made sharable and its mfn is freed (when two pages are shared, the second page's mfn is discarded), will the stub domain still refer to the old, discarded mfn if no unmapping is performed?


Cheers,

Tim.

--
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, Xen Platform Team
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

Appreciate any comments,
Jui-Hao
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

