
Re: [Xen-devel] [PATCH 10/10] xen: arm: Use a direct mapping of RAM on arm64



On Fri, 2013-06-21 at 15:10 +0100, Stefano Stabellini wrote:
> On Tue, 18 Jun 2013, Ian Campbell wrote:
> > We have plenty of virtual address space so we can avoid needing to map and
> > unmap pages all the time.
> > 
> > A totally arbitrarily chosen 32GB frame table leads to support for 5TB of
> > RAM. I haven't tested with anything near that amount of RAM though. There
> > is plenty of room to expand further when that becomes necessary.
> > 
> > Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> > ---
> >  xen/arch/arm/mm.c            |  155 ++++++++++++++++++++++++++++++++++--------
> >  xen/arch/arm/setup.c         |   92 +++++++++++++++++++++++++
> >  xen/include/asm-arm/config.h |   96 +++++++++++++++++++-------
> >  xen/include/asm-arm/mm.h     |   16 ++++
> >  4 files changed, 307 insertions(+), 52 deletions(-)
> > 
> > diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> > index 5ad687f..35ef56d 100644
> > --- a/xen/arch/arm/mm.c
> > +++ b/xen/arch/arm/mm.c
> > @@ -37,6 +37,7 @@
> >  #include <public/memory.h>
> >  #include <xen/sched.h>
> >  #include <xen/vmap.h>
> > +#include <asm/early_printk.h>
> >  #include <xsm/xsm.h>
> >  #include <xen/pfn.h>
> >  
> > @@ -48,6 +49,14 @@ struct domain *dom_xen, *dom_io, *dom_cow;
> >  lpae_t boot_pgtable[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
> >  #ifdef CONFIG_ARM_64
> >  lpae_t boot_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
> > +/* The first page of the first level mapping of the xenheap. The
> > + * subsequent xenheap first level pages are dynamically allocated, but
> > + * we need this one to bootstrap ourselves. */
> 
> Why? Can't we dynamically allocate the first page of the first level
> mapping too?

We need it to create the mappings used by the boot allocator; IOW we
need something to bootstrap ourselves. It's a bit of hoop jumping, but I
think it is needed.
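
For reference, the bootstrap page is declared statically alongside
boot_pgtable/boot_first above, so that it exists before any allocator
does. The declarations (from the part of the hunk trimmed here) are
along these lines:

    /* Statically allocated so it is usable before any allocator runs. */
    lpae_t xenheap_first_first[LPAE_ENTRIES] __attribute__((__aligned__(4096)));
    /* Slot of boot_pgtable currently served by xenheap_first_first,
     * or -1 if it hasn't been installed yet. */
    static int xenheap_first_first_slot = -1;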

> > @@ -180,6 +201,7 @@ void clear_fixmap(unsigned map)
> >      flush_xen_data_tlb_range_va(FIXMAP_ADDR(map), PAGE_SIZE);
> >  }
> >  
> > +#ifdef CONFIG_DOMAIN_PAGE
> >  void *map_domain_page_global(unsigned long mfn)
> >  {
> >      return vmap(&mfn, 1);
> >
> > @@ -284,6 +306,7 @@ unsigned long domain_page_map_to_mfn(const void *va)
> >  
> >      return map[slot].pt.base + offset;
> >  }
> > +#endif
>   
> these are exported functions, shouldn't we implement another version of
> them for the non-CONFIG_DOMAIN_PAGE case?

In the !CONFIG_DOMAIN_PAGE case these come from the common code in
xen/include/xen/domain_page.h, so we don't need to provide our own.
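
For reference, the common fallbacks are (roughly, from memory) trivial
wrappers around the direct mapping:

    /* With a direct mapping, "mapping" a domain page is just address
     * arithmetic, so the common header collapses these to macros. */
    #define map_domain_page(mfn)            mfn_to_virt(mfn)
    #define unmap_domain_page(va)           ((void)(va))
    #define domain_page_map_to_mfn(va)      virt_to_mfn((unsigned long)(va))
    #define map_domain_page_global(mfn)     mfn_to_virt(mfn)
    #define unmap_domain_page_global(va)    ((void)(va))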

> 
> 
> >  void __init arch_init_memory(void)
> >  {
> > @@ -431,6 +454,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> >      /* Flush everything after setting WXN bit. */
> >      flush_xen_text_tlb();
> >  
> > +#ifdef CONFIG_ARM_32
> >      per_cpu(xen_pgtable, 0) = boot_pgtable;
> >      per_cpu(xen_dommap, 0) = xen_second +
> >          second_linear_offset(DOMHEAP_VIRT_START);
> > @@ -440,43 +464,31 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> >      memset(this_cpu(xen_dommap), 0, DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> >      flush_xen_dcache_va_range(this_cpu(xen_dommap),
> >                                DOMHEAP_SECOND_PAGES*PAGE_SIZE);
> > +#endif
> >  }
> >  
> >  int init_secondary_pagetables(int cpu)
> >  {
> > -    lpae_t *root, *first, *domheap, pte;
> > -    int i;
> > -
> > -    root = alloc_xenheap_page();
> >  #ifdef CONFIG_ARM_64
> > -    first = alloc_xenheap_page();
> > +    /* All CPUs share a single page table on 64 bit */
> > +    return 0;
> 
> I would just ifdef the entire function

I expect you don't mean for me to also ifdef the call? I preferred to
write the prototype only once so the two didn't get out of sync in the
future, which can be an issue with the pattern of stubbing out functions
with a separate static inline. I can change that if you have a strong
preference though.
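
To sketch the two patterns I mean (illustrative only, not the actual
code):

    /* What the patch does: one definition and one prototype, with an
     * early return on 64-bit, so there is nothing to keep in sync. */
    int init_secondary_pagetables(int cpu)
    {
    #ifdef CONFIG_ARM_64
        return 0; /* all CPUs share a single set of page tables */
    #else
        /* ... per-CPU pagetable setup ... */
    #endif
    }

    /* The alternative: a separate static inline stub, which duplicates
     * the signature and can drift if the real one later changes. */
    #ifdef CONFIG_ARM_64
    static inline int init_secondary_pagetables(int cpu) { return 0; }
    #else
    int init_secondary_pagetables(int cpu);
    #endif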

> > +void __init setup_xenheap_mappings(unsigned long base_mfn,
> > +                                   unsigned long nr_mfns)
> > +{
> > +    lpae_t *first, pte;
> > +    unsigned long offset, end_mfn;
> > +    vaddr_t vaddr;
> > +
> > +    /* First call sets the xenheap physical offset. */
> > +    if ( xenheap_mfn_start == ~0UL )
> > +        xenheap_mfn_start = base_mfn;
> > +
> > +    if ( base_mfn < xenheap_mfn_start )
> > +        early_panic("cannot add xenheap mapping at %lx below heap start %lx\n",
> > +                    base_mfn, xenheap_mfn_start);
> > +
> > +    end_mfn = base_mfn + nr_mfns;
> > +
> > +    /* Align to previous 1GB boundary */
> > +    base_mfn &= ~FIRST_MASK;
> > +
> > +    offset = base_mfn - xenheap_mfn_start;
> > +    vaddr = DIRECTMAP_VIRT_START + offset*PAGE_SIZE;
> > +
> > +    while ( base_mfn < end_mfn )
> > +    {
> > +        int slot = zeroeth_table_offset(vaddr);
> > +        lpae_t *p = &boot_pgtable[slot];
> > +
> > +        if ( p->pt.valid )
> > +        {
> > +            /* mfn_to_virt is not valid on the first first-level
> > +             * mfn, since it is not within the xenheap. */
> > +            first = slot == xenheap_first_first_slot ?
> > +                xenheap_first_first : mfn_to_virt(p->pt.base);
> > +        }
> > +        else if ( xenheap_first_first_slot == -1)
> > +        {
> > +            /* Use xenheap_first_first to bootstrap the mappings */
> > +            first = xenheap_first_first;
> > +
> > +            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
> > +            pte.pt.table = 1;
> > +            write_pte(p, pte);
> > +
> > +            xenheap_first_first_slot = slot;
> 
> I take that at this point alloc_boot_pages wouldn't work?

Correct, we haven't mapped any memory for it to use yet.
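
Spelled out, the circular dependency is:

    /* alloc_boot_pages() itself would succeed -- it only returns an MFN --
     * but turning that MFN into a usable pointer needs mfn_to_virt(),
     * which relies on the very directmap we are in the middle of
     * building here. */
    unsigned long first_mfn = alloc_boot_pages(1, 1);
    lpae_t *first = mfn_to_virt(first_mfn); /* no mapping for this yet */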

> > +        }
> > +        else
> > +        {
> > +            unsigned long first_mfn = alloc_boot_pages(1, 1);
> > +            pte = mfn_to_xen_entry(first_mfn);
> > +            pte.pt.table = 1;
> > +            write_pte(p, pte);
> > +            first = mfn_to_virt(first_mfn);
> > +        }
> > +
> > +        pte = mfn_to_xen_entry(base_mfn);
> > +        //pte.pt.contig = /* XXX need to think about the logic */
> 
> Two different comment styles in the same line.

Yes, I use // for "temporary" comments. I should just have removed that
line, I think.

> Setting the contig bit here means that we assume that it's likely that
> the first 16 adjacent entries in this table point to a contiguous output
> address range. I think that should be the case, unless the memory banks
> are noncontiguous.

Since 16 contiguous entries at this level means multiples of 16GB of
RAM, I think we do need to handle the non-contig case properly. I'm not
sure what happens if e.g. you have 8GB of RAM and set the contig bit on
those 8 entries but don't fill in the other 8 at all; I suspect you get
either undefined or implementation-defined behaviour. IOW more thought
is needed.
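
One conservative option would be to set the bit only when the whole
naturally aligned 16-entry block is known to end up populated.
Something like the sketch below (assuming FIRST_SHIFT is the shift
covered by one first-level entry; this still wouldn't handle a block
split across two setup_xenheap_mappings() calls, so non-contiguous
banks need more than this):

    /* In a direct mapping the output addresses of adjacent populated
     * entries are contiguous by construction, so the only question is
     * whether all 16 entries of the aligned block will be written. */
    const unsigned long entry_mfns = 1UL << (FIRST_SHIFT - PAGE_SHIFT);
    unsigned long block_start = base_mfn & ~(16 * entry_mfns - 1);
    unsigned long block_end = block_start + 16 * entry_mfns;

    pte = mfn_to_xen_entry(base_mfn);
    pte.pt.contig = (block_start >= xenheap_mfn_start) && (block_end <= end_mfn);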

> > +#ifdef CONFIG_ARM_64
> > +    nr_second = frametable_size >> SECOND_SHIFT;
> > +    second_base = alloc_boot_pages(nr_second, 1);
> > +    second = mfn_to_virt(second_base);
> > +    for ( i = 0; i < nr_second; i++ )
> > +    {
> > +        pte = mfn_to_xen_entry(second_base + i);
> > +        pte.pt.table = 1;
> > +        write_pte(&boot_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);
> > +    }
> > +    create_32mb_mappings(second, 0, base_mfn, frametable_size >> PAGE_SHIFT);
> > +#else
> >      create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START, base_mfn, frametable_size >> PAGE_SHIFT);
> > +#endif
> 
> Is it possible to unify the arm32 and arm64 code paths here?

I tried to unify them as much as I could.

> Being able to handle a frametable that spans multiple second level
> pagetable pages shouldn't be a problem for arm32.

That's not the issue. The issue is that the frame table lives at
different addresses in the two cases. In particular, 64-bit has its own
first level slots with dedicated second level pages for this region,
while 32-bit has a small number of second level entries under the same
first level entry as various other bits of Xen (including .text, the
vmap, the ioremap region, etc.).
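
Annotating the hunk above makes the asymmetry explicit:

    /* arm64: the frametable owns whole first-level slots of boot_first,
     * with freshly allocated second-level pages hooked underneath: */
    write_pte(&boot_first[first_table_offset(FRAMETABLE_VIRT_START)+i], pte);

    /* arm32: the frametable borrows a few entries of the shared
     * xen_second table, which also maps .text, the vmap, ioremap, ...: */
    create_32mb_mappings(xen_second, FRAMETABLE_VIRT_START, base_mfn,
                         frametable_size >> PAGE_SHIFT);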

> > +        /*
> > +         * Locate the xenheap using these constraints:
> > +         *
> > +         *  - must be 32 MiB aligned
> > +         *  - must not include Xen itself or the boot modules
> > +         *  - must be at most 1 GiB
> > +         *  - must be at least 128M
> > +         *
> > +         * We try to allocate the largest xenheap possible within these
> > +         * constraints.
> > +         */
> 
> this comment is not correct on ARM64, is it?

No. On arm64 the xenheap is simply all of RAM, so those constraints
only apply to the 32-bit case. I'll fix the comment.

> > +#ifdef CONFIG_ARM_32
> >  #define is_xen_heap_page(page) is_xen_heap_mfn(page_to_mfn(page))
> >  #define is_xen_heap_mfn(mfn) ({                                 \
> >      unsigned long _mfn = (mfn);                                 \
> >      (_mfn >= xenheap_mfn_start && _mfn < xenheap_mfn_end);      \
> >  })
> 
> I take that this check wouldn't work for arm64 because it would always
> return true?

Correct. On arm64 the xenheap spans all of RAM, so the
[xenheap_mfn_start, xenheap_mfn_end) range covers every valid MFN and
the check degenerates to "is this RAM at all".

> > +#else
> > +#define is_xen_heap_page(page) ((page)->count_info & PGC_xen_heap)
> > +#define is_xen_heap_mfn(mfn) \
> > +    (mfn_valid(mfn) && is_xen_heap_page(__mfn_to_page(mfn)))
> > +#endif
> 
> Looks like the ARM64 version of these two functions should work for
> arm32 too.

I don't think arm32 sets PGC_xen_heap. It could, I suppose.
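
If we did want that, the flag would have to be set when xenheap pages
are handed out, roughly the way the common non-separate-xenheap
allocator does it for x86. A sketch:

    /* Sketch: tag pages as xenheap at allocation time so that a
     * count_info-based is_xen_heap_page() works on arm32 too. */
    void *alloc_xenheap_pages(unsigned int order, unsigned int memflags)
    {
        struct page_info *pg = alloc_domheap_pages(NULL, order, memflags);
        unsigned int i;

        if ( pg == NULL )
            return NULL;

        for ( i = 0; i < (1u << order); i++ )
            pg[i].count_info |= PGC_xen_heap;

        return page_to_virt(pg);
    }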

Ian.

