[Xen-devel] Page fault is 4 times faster with XI shadow mechanism


  • To: zhu <vanbas.han@xxxxxxxxx>
  • From: "Robert Phillips" <rsp.vi.xen@xxxxxxxxx>
  • Date: Sat, 1 Jul 2006 14:55:18 -0400
  • Cc: Xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Sat, 01 Jul 2006 11:55:41 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hello Han,

I am pleased you approve of the design and implementation of the XI shadow mechanism. And I appreciate the time and care you've taken in reviewing this substantial body of new code.

You asked about performance statistics.  With the current XI patch, we are seeing the following:
  • page-fault times for XI are about 4 times faster than non-XI: 10.56 usec (non-XI) vs 2.43 usec (XI)
  • sync-all times for XI are about 18% faster: 39.72 usec (non-XI) vs 33.51 usec (XI)
  • invalidate-page times for XI are about 5 times faster: 22.75 usec (non-XI) vs 4.00 usec (XI)
  • we haven't measured gva-to-gpa, but I would expect it to be about the same for both, since it's quite simple.
You can easily gather your own statistics.  The XI patch gathers statistics and prints them when you type 'y' from the XEN console.  ('Y' clears the statistics.)  Statistics gathering occurs even when the XI code is disabled in xen/Config.mk.  Of course then it gives you statistics for the non-XI shadow code.
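
For anyone comparing their own numbers, the ratios quoted above can be checked with a couple of small helpers. This is only a sketch: the function names speedup and percent_faster are made up for illustration and are not part of the XI patch.

```c
#include <assert.h>

/* Hypothetical helper: how many times faster the XI path is,
 * computed as baseline time divided by XI time. */
double speedup(double baseline_usec, double xi_usec)
{
    assert(xi_usec > 0.0);
    return baseline_usec / xi_usec;
}

/* Hypothetical helper: the same ratio expressed as "percent faster". */
double percent_faster(double baseline_usec, double xi_usec)
{
    return (speedup(baseline_usec, xi_usec) - 1.0) * 100.0;
}
```

With the figures above, speedup(10.56, 2.43) is roughly 4.3, percent_faster(39.72, 33.51) is roughly 18.5, and speedup(22.75, 4.00) is roughly 5.7.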

---

In an earlier email you provided a code fix; that is, "if (c_curr_rw && !_32pae_l3)".  Good catch!
I will incorporate your fix in our code base. As you suggested, should a guest L3 PTE erroneously have its R/W flag set, the XI shadow code would propagate the error and set the R/W flag in the shadow L3 PTE. Perhaps the XI code could do a better job of validating guest page table entries, but I was reluctant to be more rigorous about checking guest PTEs than real hardware is.

In your latest email, you ask "Do we really need to reserve one snapshot page for each smfn at first and retain it until the HVM domain is destroyed?"

Well, I don't.  I simply pre-allocate a pool of SPTIs.  It can be quite a large pool, but certainly not one SPTI per MFN.  SPTIs are allocated on demand (when a guest page needs to be shadowed) and, when the pool runs low, the LRU SPTs are torn down and their SPTIs recycled.

Currently I allocate about 5% of system memory for this purpose (this includes the SPT, its snapshot and the backlink pages) and, with that reasonable investment, we get very good performance.  With more study, I'm sure things could be tuned even better.  (I hope I have properly understood your questions.)
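
To make the pool behaviour described above concrete, here is a minimal sketch of a pre-allocated pool with on-demand allocation and LRU recycling. It is not the XI patch's code: the struct layout, the function names (pool_init, pool_touch, pool_recycle_lru, pool_alloc), the tiny POOL_SIZE, the LOW_WATER threshold and the linear LRU scan are all assumptions chosen to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4  /* assumed; a real pool would be far larger */
#define LOW_WATER 1  /* assumed: recycle when this few entries remain free */

struct spti {
    unsigned long gmfn;      /* guest frame this entry shadows (illustrative) */
    unsigned long last_used; /* logical clock value of the last touch */
    int in_use;
};

struct spti_pool {
    struct spti slot[POOL_SIZE]; /* pre-allocated up front */
    size_t free_count;
    unsigned long clock;
};

void pool_init(struct spti_pool *p)
{
    for (size_t i = 0; i < POOL_SIZE; i++)
        p->slot[i].in_use = 0;
    p->free_count = POOL_SIZE;
    p->clock = 0;
}

/* Record that an entry was just used, so it is not the next LRU victim. */
void pool_touch(struct spti_pool *p, struct spti *s)
{
    s->last_used = ++p->clock;
}

/* Free the least recently used entry. A real implementation would
 * tear down the corresponding shadow page table here. */
void pool_recycle_lru(struct spti_pool *p)
{
    struct spti *victim = NULL;
    for (size_t i = 0; i < POOL_SIZE; i++) {
        struct spti *s = &p->slot[i];
        if (s->in_use && (victim == NULL || s->last_used < victim->last_used))
            victim = s;
    }
    if (victim != NULL) {
        victim->in_use = 0;
        p->free_count++;
    }
}

/* Allocate an entry on demand, recycling first if the pool runs low. */
struct spti *pool_alloc(struct spti_pool *p, unsigned long gmfn)
{
    if (p->free_count <= LOW_WATER)
        pool_recycle_lru(p);
    assert(p->free_count > 0);
    for (size_t i = 0; i < POOL_SIZE; i++) {
        struct spti *s = &p->slot[i];
        if (!s->in_use) {
            s->in_use = 1;
            s->gmfn = gmfn;
            p->free_count--;
            pool_touch(p, s);
            return s;
        }
    }
    return NULL; /* not reached while POOL_SIZE > LOW_WATER */
}
```

The point of the sketch is only that the pool size is fixed up front and is independent of the number of MFNs; a linked LRU list would avoid the linear scan used here.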

-- rsp

On 7/1/06, zhu <vanbas.han@xxxxxxxxx> wrote:
Hi,
After taking some time to dig into your patch for the XI shadow page
table, I have to say it's a really good design and implementation IMHO,
especially the clear hierarchy for each smfn, the decision table, and
the rather elegant way 32nopae is supported. However, I have several
questions to discuss with you. :-)
1) It seems the XI shadow pgt reserves all of the possible resources at
an early stage for the HVM domain (the first time the asi is created).
It could be quite proper to reserve the smfns and sptis. However, do we
really need to reserve one snapshot page for each smfn at first and
retain it until the HVM domain is destroyed? I guess a large number of
gpts will not be modified frequently after they are completely set up.
IMHO, it would be better to manage these snapshot pages dynamically. Of
course, this will change the basic logic of the code, e.g. you would
have to sync the shadow pgt when invoking spti_make_shadow instead of
leaving it out of sync, and you couldn't set up the whole low-level
shadow pgt when invoking resync_spte, since it could cost a lot of time.
2) The GP back link plays a very important role in the XI shadow pgt.
However, it will also cause high memory pressure for the domain (2 pages
for each smfn). For normal guest pages, as opposed to GPT pages, I guess
its usage is limited. We refer to the back link for these normal guest
pages only when invoking xi_invld_mfn or divide_large_page, or during
dirty logging. Is it reasonable to implement the back link only for the
GPT pages? Of course, this will increase the complexity of the code a
little.
3) Can you show us statistics comparing the current shadow pgt and the
XI pgt for some critical operations, such as shadow_resync_all,
gva_to_gpa, shadow_fault and so on? I'm really curious about it.

I have to say I'm not very familiar with the current shadow pgt
implementation, so I may have missed some important considerations in
posting these questions. Please point them out.
Thanks for sharing your ideas and code with us. :-)

_______________________________________________________
Best Regards,
hanzhu



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel



--
--------------------------------------------------------------------
Robert S. Phillips                          Virtual Iron Software
rphillips@xxxxxxxxxxxxxxx                Tower 1, Floor 2
978-849-1220                                 900 Chelmsford Street
                                                    Lowell, MA 01851

 

