[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Minios-devel] [PATCH 29/40] arm64: init the memory system before console/xenbus/gic



Hi Shijie,

On 08/11/17 08:54, Huang Shijie wrote:
> On Mon, Nov 06, 2017 at 05:06:58PM +0000, Julien Grall wrote:
>> Hi Shijie,
>>
>> On 03/11/17 03:12, Huang Shijie wrote:
>>> The mappings for console/xenbus/gic need several pages to setup
>>> the page tables.
>>>
>>> So we init the memory system before initiating the console/xenbus/gic,
>>> and then we can allocate the pages which are needed by
>>> the console/xenbus/gic mapping.
>>>
>>> we use the 2M memory block for the memory.
>>
>> Why 2MB and not 1GB mapping when possible? Also, what if the guest is using
> 2MB is a proper size for most of the memory. For example, we can use 2M
> for 512MB, it is not proper to use 1GB mapping.

And you can be clever and mixing 2MB/1GB mapping depending on what you need to 
do.

>> less than 2MB of memory?
> Please increase the guest memory, even the mini-os itself is nearly 2MB.

Really? I can't build mini-os arm64 due to compilation issue:

In file included from dtc/libfdt/fdt.c:54:0:
mini-os/include/libfdt.h:54:24: fatal error: libfdt_env.h: No such file or 
directory
 #include <libfdt_env.h>
                        ^
But building x86, the resulting image is only 320K. How come the arm64 image
is nearly 10 times bigger?

> 
>>
>>>
>>> Change-Id: I0d075478c107cf680d1a20989d57c0fcd727ab29
>>> Jira: ENTOS-247
>>> Signed-off-by: Huang Shijie <shijie.huang@xxxxxxx>
>>> ---
>>>    arch/arm/mm.c         | 98 
>>> +++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    arch/arm/setup.c      |  5 +++
>>>    include/arm/arch_mm.h |  1 +
>>>    mm.c                  |  5 ++-
>>>    4 files changed, 108 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm/mm.c b/arch/arm/mm.c
>>> index edb734f..fb7886c 100644
>>> --- a/arch/arm/mm.c
>>> +++ b/arch/arm/mm.c
>>> @@ -24,6 +24,102 @@ unsigned long allocate_ondemand(unsigned long n, 
>>> unsigned long alignment)
>>>        BUG();
>>>    }
>>> +#if defined(__aarch64__)
>>> +
>>> +#include <arm64/pagetable.h>
>>> +
>>> +extern lpae_t boot_l1_pgtable[512];
>>> +
>>> +static inline void set_pgt_entry(lpae_t *ptr, lpae_t val)
>>> +{
>>> +    *ptr = val;
>>> +    dsb(ishst);
>>> +    isb();
>>> +}
>>> +
>>> +static void build_pud(lpae_t *pgd, unsigned long vaddr, unsigned long vend,
>>
>> Is vend included or excluded from the range mapped?
> excluded.
>>
>>> +                      paddr_t phys, long mem_type,
>>
>> Why the mem_type is long?
> I like the long type. :)
> Why use other type?

Whether you like or not mem_type, as the name says, is a type. Type are not 
signed.
The way you use it all sort of problem arise when you use shift or the value
needs to be converted to a bigger type.

Now, you or it in the page-table. This means you expect that value to contain 
some
information for the LPAE entry. An entry is 64-bit. So it should be uint64_t.

> 
>>
>>> +                      paddr_t (*new_page)(void), int level)
>>> +{
>>> +    lpae_t *pud;
>>> +
>>> +    pud = (lpae_t *)to_virt((*pgd) & ~ATTR_MASK_L) + l2_pgt_idx(vaddr);
>>> +    do {
>>> +        if (level == 2)
>>> +             set_pgt_entry(pud, (phys & L2_MASK) | mem_type | L2_BLOCK);
>>> +        vaddr += L2_SIZE;
>>> +        phys += L2_SIZE;
>>> +        pud++;
>>
>> I am not sure to understand why you handle the third level of page-table in
> The #30 is for page-size mapping, this one is for block mapping.
> 
> The two patches are used for different purpose.

While the purpose of using 4KB and 2MB mappings may be different in the Mini-OS
context. Page-table are just page-table. You have helpers to build them and they
will not changed. If they need, then you clearly did something wrong.

Furthermore, as I already said one patch to handle all levels of page-table is
just easier to review.

[...]

>>
>>> +                     unsigned long max_pfn, long mem_type,
>>> +                     paddr_t (*new_page)(void), int level)
>>> +{
>>> +    paddr_t p_start;
>>> +    unsigned long v_end, next;
>>> +    lpae_t *pgd;
>>> +
>>> +    v_end = vaddr + max_pfn * PAGE_SIZE;
>>> +    p_start = PFN_PHYS(start_pfn);
>>> +
>>> +    /* The boot_l1_pgtable can span 512G address space */
>>> +    pgd = &boot_l1_pgtable[l1_pgt_idx(vaddr)];
>>> +
>>> +    do {
>>> +        next = (vaddr + L1_SIZE);
>>> +        if (next > v_end)
>>> +            next = v_end;
>>> +
>>> +        if ((*pgd) == L1_INVAL) {
>>> +            set_pgt_entry(pgd, (new_page()) | PT_PT);
>>
>> What if new_page() fails?
> It will call the BUG() when it fails, please see alloc_new_page()..

I looked at the only caller in this patch and there are no BUG()...

So your interface is already wrong (see below). But a BUG() in alloc_new_page()
seem pretty odd. You want to let the caller decides what to do if the allocation
fails? In fact even the x86 do return an error on allocation failure. So why not
Arm?

>>
>>> +        }
>>> +
>>> +        build_pud(pgd, vaddr, next, p_start, mem_type, new_page, level);
>>> +        p_start += next - vaddr;
>>> +        vaddr = next;
>>> +        pgd++;
>>> +    } while (vaddr != v_end);
>>> +}
>>
>> The code you implemented for page-table looks far too simple. For instance,
>> I am surprised there are no TLB maintenance anywhere.
> We can add the TLB maintenance in future in the places it needed.

When is that future? As I said previously, page-table are going to be used
in multiple difference places. Implementing the page-table correctly from
the beginning does not hurt and will save time later on for all of us later
on.

>>
>>> +
>>> +static unsigned long first_free_pfn;
>>
>> Likely this variable and get_new_page needs more explanation in the code.
> okay, I stole this code from the x86 in mini-os.

Sorry but I can't find this code in x86... The variable is here but seem to
be used differently. Would you mind giving a pointer?

> 
>>
>>> +static paddr_t get_new_page(void)
>>
>> I think it would be better if you had early (or something similar) in the
>> name. To make clear that is only used for early boot.
> What's about get_new_page_early or get_new_page_boot?

I would prefer early_alloc_page()

> 
>>
>>> +{
>>> +    paddr_t new_page;
>>> +
>>> +    memset(pfn_to_virt(first_free_pfn), 0, PAGE_SIZE);
>>
>> This is quite confusing, get_new_page() allocates page-table that
>> will be used to map the RAM. So how do you bootstrap it?
> sorry, I do not quite understand your meaning..
> 
> Do you worry about the virtual address of first_free_pfn?

Yes. Where does the mapping between first_free_pfn and its virtual address
is done?

> 
>>
>>> +    dsb(ishst);
>>> +
>>> +    new_page = PFN_PHYS(first_free_pfn);
>>> +    first_free_pfn++;
>>
>> What if you exhaust in memory? AFAICT, the user will see a failure like a
> It will never happen. Each page will used for 1G. If we have 4G memory,
> we will only use 4 page.

The practice is often different from the theory... You make your assumption
based on the current use and now how it could potentially be used.

There is strictly no harm to add an ASSERT() checking this assumption. This
will save people a lot of debugging later on.

>   
> Please just think it again..

Well, I am only trying to help moving this series to a maintainable and 
debuggable
state...

Cheers,

-- 
Julien Grall

_______________________________________________
Minios-devel mailing list
Minios-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/cgi-bin/mailman/listinfo/minios-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.