Xen project Mailing List

Re: [Xen-devel] Consult some concepts about shadow paging mechanism

To: Gianluca Guida <gianluca.guida@xxxxxxxxxxxxx>

From: Jui-Hao Chiang <windtracekimo@xxxxxxxxx>

Date: Sun, 3 May 2009 09:39:26 -0400

Delivery-date: Sun, 03 May 2009 06:41:10 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

I found the answer: my mistake was passing all four sl2mfn entries in
v->arch.paging.shadow.l3table[] to sh_walk_l2_table(). In fact I only need
to pass v->arch.paging.shadow.l3table[0], because SHADOW_FOREACH_L2E
already loops over the four sl2mfns.

But I have another question about traversing the SPT from level 3 through
level 2 down to level 1. When I walk down to the level 1 SPT, using the
same checks as sh_audit_l1_table(), I find several inconsistencies between
the gl1e and sl1e contents. Is this normal? I thought they should stay
consistent at all times.

My purpose is to walk down the SPT and GPT on each process context switch
(sh_update_cr3) and first collect some statistics, e.g. on the dirty,
accessed and present bits.

I have now tried an extra check in the level 2 SPT that skips every sl1mfn
which does not pass the sh_mfn_is_a_page_table(sl1mfn) check, and with it
the inconsistencies in the level 1 SPT traversal are gone.

Can anyone give me a hint about the right way to do this? Is there some
special type of SPTE that I should not traverse down?

Many thanks,
Jui-Hao

On Fri, May 1, 2009 at 10:47 PM, Jui-Hao Chiang <windtracekimo@xxxxxxxxx> wrote:
> Hi, sorry for disturbing you guys again.
>
> Assume the guest's paging level is 2 and the shadow is using level 3 PAE.
> I am now trying to dump the L2 shadow page table information at the
> beginning of sh_update_cr3() as follows (actually copying the
> code from the sh_audit_l2_table and audit_gfn_to_mfn functions).
>
> The code unexpectedly crashes in guest_l2e_get_flags(*gl2e) in the
> sh_walk_l2_table I wrote.
> However, the weird part is that the code doesn't crash in gfn =
> guest_l2e_get_gfn(*gl2e), which accesses *gl2e in a similar way
> to guest_l2e_get_flags.
>
> static inline mfn_t
> convert_gfn_to_mfn(struct vcpu *v, gfn_t gfn, mfn_t gmfn)
> {
>     p2m_type_t p2mt;
>
>     if ( !shadow_mode_translate(v->domain) )
>         return _mfn(gfn_x(gfn));
>
>     if ( (mfn_to_page(gmfn)->u.inuse.type_info & PGT_type_mask)
>          != PGT_writable_page )
>         return _mfn(gfn_x(gfn)); /* This is a paging-disabled shadow */
>     else
>         return gfn_to_mfn(v->domain, gfn, &p2mt);
> }
>
> /* JuiHao: walk the l2 shadow page table based on input sl2mfn */
> static int sh_walk_l2_table(struct vcpu *v, mfn_t sl2mfn, mfn_t x)
> {
>     guest_l2e_t *gl2e, *gp;
>     shadow_l2e_t *sl2e;
>     mfn_t sl1mfn, gl2mfn;
>     gfn_t gfn;
>     mfn_t gmfn;
>     int done = 0;
>
>     /* Follow the backpointer in struct shadow_page_info to get the
>      * guest l2mfn */
>     gl2mfn = _mfn(mfn_to_shadow_page(sl2mfn)->backpointer);
>     gl2e = gp = sh_map_domain_page(gl2mfn);
>
>     SHADOW_FOREACH_L2E(sl2mfn, sl2e, &gl2e, done, v->domain, {
>
>         gfn = guest_l2e_get_gfn(*gl2e); /* ###!!!! Works Fine !!!!!#### */
>         sl1mfn = shadow_l2e_get_mfn(*sl2e);
>
>         if (mfn_valid(sl1mfn) &&
>             (shadow_l2e_get_flags(*sl2e) & _PAGE_PRESENT)) {
>
>             /* We get this gmfn just to double check that it is
>              * equal to sl1mfn */
>             gmfn = (guest_l2e_get_flags(*gl2e) & _PAGE_PSE) /* ###!!!! CRASH !!!!!#### */
>                 ? get_fl1_shadow_status(v, gfn)
>                 : get_shadow_status(v, convert_gfn_to_mfn(v, gfn, gl2mfn),
>                                     SH_type_l1_shadow);
>
>             /* Print the raw frame numbers; mfn_t itself is not a
>              * valid argument for the PRI_mfn format */
>             if (mfn_x(gmfn) != mfn_x(sl1mfn)) {
>                 printk("!! gmfn %" PRI_mfn " != sl1mfn %" PRI_mfn "\n",
>                        mfn_x(gmfn), mfn_x(sl1mfn));
>             } else {
>                 printk("going down to traverse level 1 SPT\n");
>             }
>         }
>
>     });
>     sh_unmap_domain_page(gp);
>     return 0;
> }
>
> Could you help a little bit with this?
> Many thanks,
> Jui-Hao
>
> On Fri, Apr 24, 2009 at 9:32 AM, Gianluca Guida
> <gianluca.guida@xxxxxxxxxxxxx> wrote:
>> On Fri, Apr 24, 2009 at 6:23 AM, Jui-Hao Chiang <windtracekimo@xxxxxxxxx>
>> wrote:
>>> I have some additional questions:
>>> (1) For a normal data page, in order to propagate the Dirty or Accessed
>>> bit from the SPTE to the GPTE, the hypervisor needs to set Read-Only in
>>> the SPTE. When the write page fault on this data page comes, the
>>> hypervisor can propagate the Dirty or Accessed bit to the GPTE and set
>>> the SPTE to R/W. My question is: when does the hypervisor make it
>>> Read-Only again? Is there any place in the source code you can point to?
>>
>> What happens is this: the guest has to clear the dirty/accessed bit
>> and then flush the TLB (or invlpg the entry).
>> If the pagetable is mapped read-only (as in levels > 1), the write to
>> the pagetable will trigger the emulator, which will update the entry.
>> Otherwise (if the page is out of sync, which means a writable guest
>> pagetable, and this happens when it's an L1) the TLB flush will do the
>> job of updating the shadow entry.
>>
>> Look at how the sh_propagate function works and when it gets called.
>> It's what you're looking for.
>>
>>> (2) How many shadow pages are maintained for each guest domain? If the
>>> hypervisor keeps only one shadow page table for the active process in
>>> each guest domain, then during a guest context switch it might
>>> erase the entire shadow page table and reconstruct it for the new
>>> process, which seems like a lot of overhead. I have checked
>>> sh_update_cr3(), but I am not sure of the detailed mechanism.
>>
>> There's a pool of shadow memory that gets reused in a pseudo-LRU
>> manner. Across cr3 switches, toplevel pagetables are kept in memory, and
>> unshadowed when evicted by the allocator or when other things happen,
>> mostly based on heuristics and reference counting.
>>
>> Thanks,
>> Gianluca
>>
>> --
>> It was a type of people I did not know, I found them very strange and
>> they did not inspire confidence at all. Later I learned that I had been
>> introduced to electronic engineers.
>>                                                   E. W. Dijkstra
>>
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
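
The point in the reply at the top of this thread, that one SHADOW_FOREACH_L2E pass already covers all four sl2mfns, can be illustrated with a small standalone model. This is only a sketch and not Xen code: the names model_l2_pass and struct walk_result are hypothetical, and the model assumes that the walk always visits four consecutive shadow pages starting at the mfn it is given and steps over every second shadow entry. The arithmetic behind that assumption is that a 2-level guest L2 holds 1024 entries of 4 MB each, while a PAE shadow L2 page holds 512 entries of 2 MB each, so one guest L2 needs four shadow pages and each guest entry pairs with two shadow entries.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants for a 2-level guest shadowed with PAE tables. */
enum {
    GUEST_L2_ENTRIES  = 1024, /* non-PAE guest L2: 4 MB per entry        */
    SHADOW_L2_ENTRIES = 512,  /* PAE shadow L2 page: 2 MB per entry      */
    SHADOW_L2_PAGES   = 4,    /* shadow pages backing one guest L2       */
    SHADOW_PER_GUEST  = 2     /* shadow entries paired with one guest    */
};

struct walk_result {
    size_t guest_entries_visited; /* guest L2 entries consumed by the pass  */
    size_t pages_outside_shadow;  /* pages touched beyond the 4-page shadow */
};

/*
 * Model one pass that starts at shadow page `start_page` (0..3) and,
 * by assumption, always walks four consecutive pages from there.
 */
static struct walk_result model_l2_pass(size_t start_page)
{
    struct walk_result r = { 0, 0 };

    assert(start_page < SHADOW_L2_PAGES);

    for (size_t j = 0; j < SHADOW_L2_PAGES; j++) {
        /* Pages at index 4 or above no longer belong to this shadow. */
        if (start_page + j >= SHADOW_L2_PAGES)
            r.pages_outside_shadow++;

        /* Every second shadow entry corresponds to one guest entry. */
        for (size_t i = 0; i < SHADOW_L2_ENTRIES; i += SHADOW_PER_GUEST)
            r.guest_entries_visited++;
    }
    return r;
}
```

Under these assumptions, model_l2_pass(0) consumes exactly GUEST_L2_ENTRIES guest entries and stays inside the shadow, whereas starting from the second, third or fourth page walks one, two or three pages that are not part of this shadow at all. That is consistent with passing only l3table[0], although the model does not claim to identify the exact cause of the crash reported above.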

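The dirty/accessed handling discussed in the thread also suggests one reason why a shadow L1 entry need not be bit-for-bit equal to its guest entry. The sketch below is a simplified model, not the real sh_propagate: the function names and the PTE_* macros are hypothetical, only the x86 bit positions are standard, and the rule it encodes (withhold Present until the guest Accessed bit is set, withhold R/W until the guest Dirty bit is set) is an assumption based on the explanation quoted above. It also includes a small counter for the present/accessed/dirty statistics mentioned in the top reply.

```c
#include <stddef.h>
#include <stdint.h>

/* Standard x86 page-table flag bits (named PTE_* here for the sketch). */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

/*
 * Hypothetical simplification: derive shadow L1 flags from guest L1 flags
 * so that the first access and the first write both fault, letting the
 * hypervisor set the Accessed and Dirty bits in the guest entry.
 */
static uint32_t model_shadow_l1_flags(uint32_t gflags)
{
    uint32_t sflags = gflags;

    if (!(gflags & PTE_PRESENT))
        return 0;                   /* nothing to shadow */
    if (!(gflags & PTE_ACCESSED))
        sflags &= ~PTE_PRESENT;     /* force a fault on first access */
    if (!(gflags & PTE_DIRTY))
        sflags &= ~PTE_RW;          /* force a fault on first write */
    return sflags;
}

struct pte_stats {
    size_t present;
    size_t accessed;
    size_t dirty;
};

/* Count present/accessed/dirty bits over an array of entry flags. */
static struct pte_stats count_pte_flags(const uint32_t *flags, size_t n)
{
    struct pte_stats s = { 0, 0, 0 };

    for (size_t i = 0; i < n; i++) {
        if (flags[i] & PTE_PRESENT)
            s.present++;
        if (flags[i] & PTE_ACCESSED)
            s.accessed++;
        if (flags[i] & PTE_DIRTY)
            s.dirty++;
    }
    return s;
}
```

In this model a guest entry that is present, writable and accessed but not yet dirty maps to a read-only shadow entry, so a plain comparison of gl1e and sl1e flags would report a difference even though nothing is wrong. A statistics pass would therefore more plausibly read the Accessed and Dirty bits from the guest entries, where this scheme propagates them.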