
Re: [Xen-devel] Consult some concepts about shadow paging mechanism


  • To: Gianluca Guida <gianluca.guida@xxxxxxxxxxxxx>
  • From: Jui-Hao Chiang <windtracekimo@xxxxxxxxx>
  • Date: Fri, 1 May 2009 22:47:51 -0400
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
  • Delivery-date: Fri, 01 May 2009 19:48:20 -0700
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Hi, sorry for disturbing you guys again.

Assume the guest uses 2-level paging and the shadow uses 3-level PAE.
I am trying to dump the L2 shadow page table information at the
beginning of sh_update_cr3() as follows (the code is mostly copied
from the sh_audit_l2_table and audit_gfn_to_mfn functions).

The code unexpectedly crashes in guest_l2e_get_flags(*gl2e) inside the
sh_walk_l2_table function I wrote.
The weird part is that it does not crash at gfn =
guest_l2e_get_gfn(*gl2e), which accesses *gl2e in much the same way
as guest_l2e_get_flags does.
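
For reference, the index arithmetic for this configuration can be sketched as below. This is only the architectural layout (1024 four-byte entries per 2-level L2 page, 512 eight-byte entries per PAE L2 page), not Xen code and not a diagnosis of the crash. The helper names and constants are hypothetical, and the assumption that one guest L2 page corresponds to four PAE L2 pages follows from the entry counts rather than from anything checked in the shadow code.

```c
#include <assert.h>
#include <stdint.h>

/* Architectural x86 constants; the macro names are made up for this sketch. */
#define GUEST_L2_ENTRIES   1024u  /* 2-level paging: 4-byte entries, 4 KiB page */
#define SHADOW_L2_ENTRIES   512u  /* PAE: 8-byte entries, 4 KiB page */
#define GUEST_L2_SHIFT       22   /* one guest L2 entry covers 4 MiB */
#define SHADOW_L2_SHIFT      21   /* one PAE L2 entry covers 2 MiB */

/* Hypothetical helper: index of the guest L2 entry covering va. */
static unsigned int guest_l2_index(uint32_t va)
{
    return va >> GUEST_L2_SHIFT;
}

/* Hypothetical helper: which of the four PAE L2 pages covers va (one per GiB). */
static unsigned int shadow_l2_page(uint32_t va)
{
    return va >> 30;
}

/* Hypothetical helper: slot inside that PAE L2 page. */
static unsigned int shadow_l2_slot(uint32_t va)
{
    return (va >> SHADOW_L2_SHIFT) & (SHADOW_L2_ENTRIES - 1);
}

/* Map a (PAE L2 page, slot) pair back to the guest L2 index it shadows.
 * Two consecutive 2 MiB shadow slots fall under one 4 MiB guest entry. */
static unsigned int guest_index_for_shadow(unsigned int page, unsigned int slot)
{
    assert(page < 4 && slot < SHADOW_L2_ENTRIES);
    return (page * SHADOW_L2_ENTRIES + slot) / 2;
}
```

So a walker that advances a guest L2 pointer and a shadow L2 pointer together has to account for the 2:1 ratio between shadow slots and guest entries.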

static inline mfn_t
convert_gfn_to_mfn(struct vcpu *v, gfn_t gfn, mfn_t gmfn)
{
    p2m_type_t p2mt;
    if ( !shadow_mode_translate(v->domain) )
        return _mfn(gfn_x(gfn));

    if ( (mfn_to_page(gmfn)->u.inuse.type_info & PGT_type_mask)
         != PGT_writable_page )
        return _mfn(gfn_x(gfn)); // This is a paging-disabled shadow
    else
        return gfn_to_mfn(v->domain, gfn, &p2mt);
}

/* JuiHao: walk the l2 shadow page table based on input sl2mfn */
static int sh_walk_l2_table(struct vcpu *v, mfn_t sl2mfn, mfn_t x)
{
        guest_l2e_t *gl2e, *gp;
        shadow_l2e_t *sl2e;
        mfn_t sl1mfn, gl2mfn;
        gfn_t gfn;
        mfn_t gmfn;
        int done = 0;

        /* Follow the backpointer in struct shadow_page_info to get guest l2mfn */
        gl2mfn = _mfn(mfn_to_shadow_page(sl2mfn)->backpointer);
        gl2e = gp = sh_map_domain_page(gl2mfn);

        SHADOW_FOREACH_L2E(sl2mfn, sl2e, &gl2e, done, v->domain, {

                gfn = guest_l2e_get_gfn(*gl2e);  // ###!!!! Works Fine !!!!!####
                sl1mfn = shadow_l2e_get_mfn(*sl2e);
                
                if (mfn_valid(sl1mfn) && (shadow_l2e_get_flags(*sl2e) & _PAGE_PRESENT)) {

                        // We get this gmfn just to double check that it is equal to sl1mfn
                        gmfn = (guest_l2e_get_flags(*gl2e) & _PAGE_PSE) // ###!!!! CRASH !!!!!####
                                ? get_fl1_shadow_status(v, gfn)
                                : get_shadow_status(v, convert_gfn_to_mfn(v, gfn, gl2mfn),
                                                    SH_type_l1_shadow);
                        
                        if (mfn_x(gmfn) != mfn_x(sl1mfn)) {
                                printk("!! gmfn %" PRI_mfn " != sl1mfn %" PRI_mfn "\n",
                                       mfn_x(gmfn), mfn_x(sl1mfn));
                        } else {
                                printk("going down to traverse level 1 SPT\n");
                        }
                }

        });
        sh_unmap_domain_page(gp);
        return 0;
}

Could you help a little bit on this?
Many thanks,
Jui-Hao

On Fri, Apr 24, 2009 at 9:32 AM, Gianluca Guida
<gianluca.guida@xxxxxxxxxxxxx> wrote:
> On Fri, Apr 24, 2009 at 6:23 AM, Jui-Hao Chiang <windtracekimo@xxxxxxxxx> 
> wrote:
>> I have some additional questions, as follows:
>> (1) For a normal data page, in order to propagate the Dirty or Accessed
>> bit from the SPTE to the GPTE, the hypervisor needs to set Read-Only in
>> the SPTE. When the write page fault on this data page comes, the
>> hypervisor can propagate the Dirty or Accessed bit to the GPTE and set
>> the SPTE to R/W. My question is: when does the hypervisor make it
>> Read-Only again? Is there any place in the source code you can point to?
>
> What happens is this: the guest has to clear the dirty/accessed bit
> and then flush the TLB (or invlpg the entry).
> If the pagetable is mapped read-only (as in levels > 1), the write to
> the pagetable will trigger the emulator, which will update the entry.
> Otherwise (if the page is out of sync, which means a writable guest
> pagetable, and this happens when it's an L1), the TLB flush will do the
> job of updating the shadow entry.
>
> Look at how the sh_propagate function works and when it gets called. It's
> what you're looking for.
>
>> (2) How many shadow pages are maintained for each guest domain? If the
>> hypervisor keeps only one shadow page table for the active process in
>> each guest domain, then during a guest context switch it might
>> erase the entire shadow page table and reconstruct it for the new
>> process, which seems like a lot of overhead. I have checked
>> sh_update_cr3(), but I am not sure of the detailed mechanism.
>
> There's a pool of shadow memory that gets reused in a pseudo-LRU
> manner. Across cr3 switches, toplevel pagetables are kept in memory and
> unshadowed when evicted by the allocator or when other things happen,
> mostly based on heuristics and reference counting.
>
> Thanks,
> Gianluca
>
> --
> It was a type of people I did not know, I found them very strange and
> they did not inspire confidence at all. Later I learned that I had been
> introduced to electronic engineers.
>                                                  E. W. Dijkstra
>
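
To make the dirty-bit answer quoted above concrete, here is a toy model of the idea. It is not Xen's sh_propagate. The function name toy_propagate and the PTE_* macros are made up for illustration, and only the flag values are the architectural x86 ones. The assumption being modelled is that the shadow entry is rebuilt from the guest entry each time, and is writable only while the guest entry is already marked dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Architectural x86 PTE flag bits; the macro names are local to this sketch. */
#define PTE_PRESENT  0x001u
#define PTE_RW       0x002u
#define PTE_ACCESSED 0x020u
#define PTE_DIRTY    0x040u

/* Toy model (hypothetical): rebuild shadow flags from guest flags.
 * is_write says whether the access that triggered the rebuild was a write.
 * The guest entry is updated in place with Accessed/Dirty as a side effect. */
static uint32_t toy_propagate(uint32_t *gflags, bool is_write)
{
    uint32_t sflags;

    if (!(*gflags & PTE_PRESENT))
        return 0;                      /* nothing to shadow */

    *gflags |= PTE_ACCESSED;           /* any access sets Accessed */
    if (is_write && (*gflags & PTE_RW))
        *gflags |= PTE_DIRTY;          /* permitted write sets Dirty */

    sflags = *gflags;
    if (!(*gflags & PTE_DIRTY))
        sflags &= ~PTE_RW;             /* keep shadow read-only until dirty */

    assert(!(sflags & PTE_RW) || (*gflags & PTE_DIRTY));
    return sflags;
}
```

In this model the shadow entry becomes read-only again simply because it is rebuilt after the guest clears its Dirty bit. Whether that rebuild is triggered by an emulated pagetable write or by a TLB flush is the part the reply describes.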

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

