
Re: [Xen-devel] [PATCH v5 2/4] xen/arm: introduce a generic p2m walker and use it in p2m_lookup



On Mon, 2013-09-09 at 17:06 +0100, Stefano Stabellini wrote:
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx>
> 
> 
> Changes in v5:
> - align tests;
> - comment p2m_walker;
> - fix return codes in p2m_walker;
> - handle superpages in p2m_walker;
> - rename _p2m_lookup to p2m_lookup_f.
> ---
>  xen/arch/arm/p2m.c |  131 +++++++++++++++++++++++++++++++++++++++++----------
>  1 files changed, 105 insertions(+), 26 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 307c6d4..a9ceacf 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -31,48 +31,127 @@ void p2m_load_VTTBR(struct domain *d)
>  }
>  
>  /*
> - * Lookup the MFN corresponding to a domain's PFN.
> + * d: domain p2m to walk
> + * paddr: the guest start physical address
> + * order: page order
> + * func: function to call for each 2-stage lpae_t entry found

s/2-stage/stage-2/

Also it is called for each *leaf* entry, not for all of them. It could
be argued that either would be a useful interface in general.

This function is not actually all that p2m specific in the end; by
using lpae_t.walk instead of the type specific lpae_t.p2m (see
dump_pt_walk) you could make it useful for e.g. hyp pagetable walks
too (e.g. dump_hyp_walk). In theory it could also be used for LPAE
guest walks, but we'd need a separate walker for non-LPAE guests.

If we wanted to replace dump_hyp_walk with this then calling the
callback for each level would be required.
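
Something along these lines perhaps (untested sketch; table, offset and
next are placeholders here, and a hyp walker would obviously not go
through map_domain_page):

    lpae_t pte = table[offset];

    if ( !pte.walk.valid )            /* common to all entry types */
        goto err;
    if ( !pte.walk.table )            /* block mapping: leaf entry */
        rc = func(&table[offset], arg, level);
    else                              /* table entry: descend a level */
        next = map_domain_page(pte.walk.base);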

> + * arg: opaque pointer to pass to func
>   *
> - * There are no processor functions to do a stage 2 only lookup therefore we
> - * do a a software walk.
>   */
> -paddr_t p2m_lookup(struct domain *d, paddr_t paddr)
> +static int p2m_walker(struct domain *d, paddr_t paddr, unsigned int order,
> +static int p2m_walker(struct domain *d, paddr_t paddr, unsigned int order,
> +                      int (*func)(lpae_t *pte, void *arg, int level), void* arg)
>  {
> +    lpae_t *first = NULL, *second = NULL, *third = NULL;
>      struct p2m_domain *p2m = &d->arch.p2m;
> -    lpae_t pte, *first = NULL, *second = NULL, *third = NULL;
> -    paddr_t maddr = INVALID_PADDR;
> +    int rc = -EFAULT, level = 1;
> +    unsigned long cur_first_offset = ~0, cur_second_offset = ~0;
> +    paddr_t pend = paddr + (1UL << order);

1UL is 32 bits on arm32, but paddr_t is 64 bits, so this might fail for
large enough order?

I think you need something like: (((paddr_t)1UL)<<order). Or maybe a
helpful macro.
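
e.g. (the name here is made up, just to illustrate):

    /* Size in bytes of a region of the given order, as a paddr_t. */
    #define order_to_size(order) (((paddr_t)1) << (order))

    paddr_t pend = paddr + order_to_size(order);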

>  
>      spin_lock(&p2m->lock);
>  
>      first = __map_domain_page(p2m->first_level);
>  
> -    pte = first[first_table_offset(paddr)];
> -    if ( !pte.p2m.valid || !pte.p2m.table )
> -        goto done;
> +    if ( !first ||
> +         !first[first_table_offset(paddr)].p2m.valid )
> +        goto err;

Why is the first level handled specially outside the loop? What happens
if order is such that we span multiple first level entries?

> -    second = map_domain_page(pte.p2m.base);
> -    pte = second[second_table_offset(paddr)];
> -    if ( !pte.p2m.valid || !pte.p2m.table )
> -        goto done;
> +    if ( !first[first_table_offset(paddr)].p2m.table )
> +    {
> +        rc = func(&first[first_table_offset(paddr)], arg, level);
> +        if ( rc != 0 )
> +            goto err;
> +        paddr += FIRST_SIZE;
> +    }
>  
> -    third = map_domain_page(pte.p2m.base);
> -    pte = third[third_table_offset(paddr)];
> +    while ( paddr < pend )
> +    {
> +        rc = -EFAULT;
> +        level = 1;

I think this and all the level++ stuff could be replaced by literal 1,
2, and 3 at the appropriate callback sites. The level is statically
implied by the code.
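
i.e. just:

    rc = func(&second[second_table_offset(paddr)], arg, 2);

(which would incidentally also fix what looks like a bug below: the
second level callback currently passes &first[first_table_offset(paddr)]
rather than the second level entry.)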

>  
> -    /* This bit must be one in the level 3 entry */
> -    if ( !pte.p2m.table )
> -        pte.bits = 0;
> +        if ( cur_first_offset != first_table_offset(paddr) )
> +        {
> +            if (second) unmap_domain_page(second);
> +            second = map_domain_page(first[first_table_offset(paddr)].p2m.base);
> +            cur_first_offset = first_table_offset(paddr);
> +        }
> +        level++;
> +        if ( !second ||

ASSERT(second) seems reasonable here, I think, since it would indicate
we had screwed up the p2m pretty badly. That would...

> +             !second[second_table_offset(paddr)].p2m.valid )
> +            goto err;

... leave the -EFAULT resulting from this for a failure of the actual
walk itself.
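
i.e. something like:

    ASSERT(second);
    if ( !second[second_table_offset(paddr)].p2m.valid )
        goto err;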

> +        if ( !second[second_table_offset(paddr)].p2m.table )
> +        {
> +            rc = func(&first[first_table_offset(paddr)], arg, level);
> +            if ( rc != 0 )
> +                goto err;
> +            paddr += SECOND_SIZE;

You need to continue here, don't you? Otherwise you will look into the
3rd level for the paddr + SECOND_SIZE address before you should have.

Likewise for the 1st level stuff once it is properly within the loop (I
wonder if this is what caused you to pull that out?)
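
Roughly what I have in mind (untested, just to sketch the shape, with
the first level back inside the loop and the literal levels suggested
above):

    while ( paddr < pend )
    {
        if ( !first[first_table_offset(paddr)].p2m.valid )
            goto err;
        if ( !first[first_table_offset(paddr)].p2m.table )
        {
            rc = func(&first[first_table_offset(paddr)], arg, 1);
            if ( rc != 0 )
                goto err;
            paddr += FIRST_SIZE;
            continue;      /* don't descend into the second level */
        }

        /* ... map second as before ... */

        if ( !second[second_table_offset(paddr)].p2m.valid )
            goto err;
        if ( !second[second_table_offset(paddr)].p2m.table )
        {
            rc = func(&second[second_table_offset(paddr)], arg, 2);
            if ( rc != 0 )
                goto err;
            paddr += SECOND_SIZE;
            continue;      /* don't descend into the third level */
        }

        /* ... map third and make the level 3 callback as before ... */

        paddr += PAGE_SIZE;
    }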

> +        }
>  
> -done:
> -    if ( pte.p2m.valid )
> -        maddr = (pte.bits & PADDR_MASK & PAGE_MASK) | (paddr & ~PAGE_MASK);
> +        if ( cur_second_offset != second_table_offset(paddr) )
> +        {
> +            if (third) unmap_domain_page(third);
> +            third = map_domain_page(second[second_table_offset(paddr)].p2m.base);
> +            cur_second_offset = second_table_offset(paddr);
> +        }
> +        level++;
> +        if ( !third ||
> +             !third[third_table_offset(paddr)].p2m.valid )
> +            goto err;

If the table bit of the third level entry is 0 then that is an error
too, iff valid == 1.
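
i.e.:

    if ( !third[third_table_offset(paddr)].p2m.valid ||
         !third[third_table_offset(paddr)].p2m.table )
        goto err;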

>  
> -    if (third) unmap_domain_page(third);
> -    if (second) unmap_domain_page(second);
> -    if (first) unmap_domain_page(first);
> +        rc = func(&third[third_table_offset(paddr)], arg, level);
> +        if ( rc != 0 )
> +            goto err;
> +
> +        paddr += PAGE_SIZE;
> +    }
> +
> +    rc = 0;
> +
> +err:
> +    if ( third ) unmap_domain_page(third);
> +    if ( second ) unmap_domain_page(second);
> +    if ( first ) unmap_domain_page(first);
>  
>      spin_unlock(&p2m->lock);
>  
> -    return maddr;
> +    return rc;
> +}
> +
> +struct p2m_lookup_t {
> +    paddr_t paddr;
> +    paddr_t maddr;
> +};
> +
> +static int p2m_lookup_f(lpae_t *ptep, void *arg, int level)
> +{
> +    lpae_t pte;
> +    struct p2m_lookup_t *p2m = (struct p2m_lookup_t *)arg;
> +    ASSERT(level == 3);
> +
> +    pte = *ptep;
> +
> +    /* This bit must be one in the level 3 entry */
> +    if ( !pte.p2m.table || !pte.p2m.valid )
> +        return -EFAULT;

Ah, here's that check I was talking about earlier -- you could make
this generic, I think.
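
With that check moved into the walker the callback would shrink to
something like:

    static int p2m_lookup_f(lpae_t *ptep, void *arg, int level)
    {
        struct p2m_lookup_t *p2m = arg;

        ASSERT(level == 3);
        /* The walker has already checked valid and table for us. */
        p2m->maddr = (ptep->bits & PADDR_MASK & PAGE_MASK) |
            (p2m->paddr & ~PAGE_MASK);
        return 0;
    }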

> +
> +    p2m->maddr = (pte.bits & PADDR_MASK & PAGE_MASK) |
> +        (p2m->paddr & ~PAGE_MASK);
> +    return 0;
> +}
> +/*
> + * Lookup the MFN corresponding to a domain's PFN.
> + *
> + * There are no processor functions to do a stage 2 only lookup therefore we
> + * do a a software walk.
> + */
> +paddr_t p2m_lookup(struct domain *d, paddr_t paddr)
> +{
> +    struct p2m_lookup_t p2m;
> +    p2m.paddr = paddr;
> +    p2m.maddr = INVALID_PADDR;
> +
> +    p2m_walker(d, paddr, 0, p2m_lookup_f, &p2m);
> +
> +    return p2m.maddr;
>  }
>  
>  int guest_physmap_mark_populate_on_demand(struct domain *d,


