[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [PATCH 29/40] arm64: init the memory system before console/xenbus/gic



On Mon, Nov 06, 2017 at 05:06:58PM +0000, Julien Grall wrote:
> Hi Shijie,
> 
> On 03/11/17 03:12, Huang Shijie wrote:
> > The mappings for console/xenbus/gic need several pages to setup
> > the page tables.
> > 
> > So we init the memory system before initiating the console/xenbus/gic,
> > and then we can allocate the pages which are needed by
> > the console/xenbus/gic mapping.
> > 
> > we use the 2M memory block for the memory.
> 
> Why 2MB and not 1GB mapping when possible? Also, what if the guest is using
2MB is a proper size for most of the memory. For example, we can use 2M
for 512MB, it is not proper to use 1GB mapping.
> less than 2MB of memory?
Please increase the guest memory, even the mini-os itself is nearly 2MB.

> 
> > 
> > Change-Id: I0d075478c107cf680d1a20989d57c0fcd727ab29
> > Jira: ENTOS-247
> > Signed-off-by: Huang Shijie <shijie.huang@xxxxxxx>
> > ---
> >   arch/arm/mm.c         | 98 
> > +++++++++++++++++++++++++++++++++++++++++++++++++++
> >   arch/arm/setup.c      |  5 +++
> >   include/arm/arch_mm.h |  1 +
> >   mm.c                  |  5 ++-
> >   4 files changed, 108 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/mm.c b/arch/arm/mm.c
> > index edb734f..fb7886c 100644
> > --- a/arch/arm/mm.c
> > +++ b/arch/arm/mm.c
> > @@ -24,6 +24,102 @@ unsigned long allocate_ondemand(unsigned long n, 
> > unsigned long alignment)
> >       BUG();
> >   }
> > +#if defined(__aarch64__)
> > +
> > +#include <arm64/pagetable.h>
> > +
> > +extern lpae_t boot_l1_pgtable[512];
> > +
> > +static inline void set_pgt_entry(lpae_t *ptr, lpae_t val)
> > +{
> > +    *ptr = val;
> > +    dsb(ishst);
> > +    isb();
> > +}
> > +
> > +static void build_pud(lpae_t *pgd, unsigned long vaddr, unsigned long vend,
> 
> Is vend included or excluded from the range mapped?
excluded.
> 
> > +                      paddr_t phys, long mem_type,
> 
> Why the mem_type is long?
I like the long type. :)
Why use other type?

> 
> > +                      paddr_t (*new_page)(void), int level)
> > +{
> > +    lpae_t *pud;
> > +
> > +    pud = (lpae_t *)to_virt((*pgd) & ~ATTR_MASK_L) + l2_pgt_idx(vaddr);
> > +    do {
> > +        if (level == 2)
> > +             set_pgt_entry(pud, (phys & L2_MASK) | mem_type | L2_BLOCK);
> > +        vaddr += L2_SIZE;
> > +        phys += L2_SIZE;
> > +        pud++;
> 
> I am not sure to understand why you handle the third level of page-table in
The #30 is for page-size mapping, this one is for block mapping.

The two patches are used for different purpose.

> a separate patch (see #30). This is making more difficult to find whether
> something is missing.
> 
> However, I am not fully convinced the way you implement the page-tables
> support is the right way to go. Indeed you limit yourself in using the same
> level for all the range of the mapping. This is preventing a range of
> optimization such as using 1GB mapping whenever is possible and switch to
> 2MB when it is not.
> 
> Furthermore, the LPAE format has been nicely design. There are no need to
> implement each level in a separate helper. A loop would be sufficient. I can
> live with that implementation thought.
> 
> Lastly, this code only works when adding entry. We will need to remove
> entries. See unmap_frames used by munmap.
> 
> > +    } while (vaddr < vend);
> > +}
> > +
> > +/* Setup the page table when the MMU is enabled */
> 
> I am not sure to understand this comment.
This function is called after the MMU is enabled.
> 
> > +void build_pagetable(unsigned long vaddr, unsigned long start_pfn,
> 
> Why is that exposed outside mm.c?
I will change it to static.

> 
> > +                     unsigned long max_pfn, long mem_type,
> > +                     paddr_t (*new_page)(void), int level)
> > +{
> > +    paddr_t p_start;
> > +    unsigned long v_end, next;
> > +    lpae_t *pgd;
> > +
> > +    v_end = vaddr + max_pfn * PAGE_SIZE;
> > +    p_start = PFN_PHYS(start_pfn);
> > +
> > +    /* The boot_l1_pgtable can span 512G address space */
> > +    pgd = &boot_l1_pgtable[l1_pgt_idx(vaddr)];
> > +
> > +    do {
> > +        next = (vaddr + L1_SIZE);
> > +        if (next > v_end)
> > +            next = v_end;
> > +
> > +        if ((*pgd) == L1_INVAL) {
> > +            set_pgt_entry(pgd, (new_page()) | PT_PT);
> 
> What if new_page() fails?
It will call the BUG() when it fails, please see alloc_new_page()..
> 
> > +        }
> > +
> > +        build_pud(pgd, vaddr, next, p_start, mem_type, new_page, level);
> > +        p_start += next - vaddr;
> > +        vaddr = next;
> > +        pgd++;
> > +    } while (vaddr != v_end);
> > +}
> 
> The code you implemented for page-table looks far too simple. For instance,
> I am surprised there are no TLB maintenance anywhere.
We can add the TLB maintenance in future in the places it needed.
> 
> > +
> > +static unsigned long first_free_pfn;
> 
> Likely this variable and get_new_page needs more explanation in the code.
okay, I stole this code from the x86 in mini-os.

> 
> > +static paddr_t get_new_page(void)
> 
> I think it would be better if you had early (or something similar) in the
> name. To make clear that is only used for early boot.
What's about get_new_page_early or get_new_page_boot?

> 
> > +{
> > +    paddr_t new_page;
> > +
> > +    memset(pfn_to_virt(first_free_pfn), 0, PAGE_SIZE);
> 
> This is quite confusing, get_new_page() allocates page-table that
> will be used to map the RAM. So how do you bootstrap it?
sorry, I do not quite understand your meaning..

Do you worry about the virtual address of first_free_pfn?

> 
> > +    dsb(ishst);
> > +
> > +    new_page = PFN_PHYS(first_free_pfn);
> > +    first_free_pfn++;
> 
> What if you exhaust in memory? AFAICT, the user will see a failure like a
It will never happen. Each page will used for 1G. If we have 4G memory,
we will only use 4 page. 
 
Please just think it again..    

> data abort. This will not be very obvious to track.

> 
> > +    return new_page;
> > +}
> > +
> > +/*
> > + * This function will setup the page table for the memory system.
> > + *
> > + * Note: We get the page for page table from the first free PFN,
> 
> s/get/allocate/
okay.

> 
> > + *       the get_new_page() will increase the @first_free_pfn.
> 
> But this comment would have been better on top of
> get_new_page/first_free_pfn.
okay..
> 
> > + */
> > +void init_pagetable(unsigned long *start_pfn, unsigned long base_pfn,
> > +                    unsigned long max_pfn)
> > +{
> > +    first_free_pfn = *start_pfn;
> > +
> > +    build_pagetable((unsigned long)pfn_to_virt(base_pfn), base_pfn,
> > +               max_pfn, BLOCK_DEF_ATTR, get_new_page, 2);
> > +    *start_pfn = first_free_pfn;
> > +}
> > +
> > +#else
> > +void init_pagetable(unsigned long *start_pfn, unsigned long base_pfn,
> > +                    unsigned long max_pfn)
> > +{
> > +}
> > +#endif
> > +
> >   void arch_init_mm(unsigned long *start_pfn_p, unsigned long *max_pfn_p)
> >   {
> >       int memory;
> > @@ -74,6 +170,8 @@ void arch_init_mm(unsigned long *start_pfn_p, unsigned 
> > long *max_pfn_p)
> >       heap_len = mem_size - (PFN_PHYS(*start_pfn_p) - mem_base);
> >       *max_pfn_p = *start_pfn_p + PFN_DOWN(heap_len);
> > +    init_pagetable(start_pfn_p, PHYS_PFN(mem_base), PHYS_PFN(mem_size));
> > +
> >       printk("Using pages %lu to %lu as free space for heap.\n", 
> > *start_pfn_p, *max_pfn_p);
> >       /* The device tree is probably in memory that we're about to hand 
> > over to the page
> > diff --git a/arch/arm/setup.c b/arch/arm/setup.c
> > index bde30c6..61daca3 100644
> > --- a/arch/arm/setup.c
> > +++ b/arch/arm/setup.c
> > @@ -41,6 +41,11 @@ void arch_init(void *dtb_pointer, paddr_t 
> > physical_offset)
> >       /* Map shared_info page */
> >       HYPERVISOR_shared_info = map_shared_info(NULL);
> > +#if defined(__aarch64__)
> 
> Carve out from arch_init_mm what is necessary but don't call init_mm() early
> just for aarch64. See what x86 did with arch_mm_preinit().
Thanks, I will check that.
> 
> > +    /* We need to init the memory, and then init the console/xenbus/gic */
> > +    init_mm();
> > +#endif

Thanks
Huang Shijie

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.