[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 1/3] xen/riscv: introduce setup_mm()



On Tue, 2024-11-12 at 12:22 +0100, Jan Beulich wrote:
> On 11.11.2024 19:16, Oleksii Kurochko wrote:
> > @@ -25,8 +27,11 @@
> >  
> >  static inline void *maddr_to_virt(paddr_t ma)
> >  {
> > -    BUG_ON("unimplemented");
> > -    return NULL;
> > +    unsigned long va_offset = maddr_to_directmapoff(ma);
> > +
> > +    ASSERT(va_offset < DIRECTMAP_SIZE);
> 
> Much like with the consideration towards virt_to_maddr() that was
> corrected from v4, I think this one also needs adjusting:
> 
>     ASSERT(va_offset < DIRECTMAP_SIZE + (DIRECTMAP_VIRT_START -
>                                          directmap_virt_start));
> 
> This is because ...
> 
> > +    return (void *)(directmap_virt_start + va_offset);
> 
> ... you're offsetting the VA here. It may then want accompanying
> by
> 
>     ASSERT(va_offset >= DIRECTMAP_VIRT_START - directmap_virt_start);
> 
> (probably to go first).
I might be misunderstanding something, but shouldn't the assertions be
as follows?
   DIRECTMAP_VIRT_START <= (directmap_virt_start + va_offset) <
   DIRECTMAP_VIRT_SIZE

So, va_offset should be in the range:
   DIRECTMAP_VIRT_START - direcmap_virt_start <= va_offset <
   DIRECTMAP_VIRT_SIZE - directmap_virt_start
   
This would make the assertions look like:
   ASSERT(va_offset < DIRECTMAP_VIRT_SIZE - directmap_virt_tart)
   ASSERT(va_offset >= DIRECTMAP_VIRT_START - directmap_virt_start)

So, we have different check for the first ASSERT. Could you please
clarify where my calculations are wrong?

> 
> > @@ -423,3 +429,147 @@ void * __init early_fdt_map(paddr_t
> > fdt_paddr)
> >  
> >      return fdt_virt;
> >  }
> > +
> > +vaddr_t __ro_after_init directmap_virt_start =
> > DIRECTMAP_VIRT_START;
> > +
> > +struct page_info *__ro_after_init frametable_virt_start =
> > frame_table;
> > +
> > +#ifndef CONFIG_RISCV_32
> > +
> > +/* Map a frame table to cover physical addresses ps through pe */
> > +static void __init setup_frametable_mappings(paddr_t ps, paddr_t
> > pe)
> > +{
> > +    static mfn_t __initdata frametable_mfn_start =
> > INVALID_MFN_INITIALIZER;
> > +
> > +    paddr_t aligned_ps = ROUNDUP(ps, PAGE_SIZE);
> > +    paddr_t aligned_pe = ROUNDDOWN(pe, PAGE_SIZE);
> > +    unsigned long nr_mfns = PFN_DOWN(aligned_pe - aligned_ps);
> > +    unsigned long frametable_size = nr_mfns *
> > sizeof(*frame_table);
> > +    mfn_t base_mfn;
> > +
> > +    if ( mfn_eq(frametable_mfn_start, INVALID_MFN) )
> > +    {
> > +        frametable_mfn_start = maddr_to_mfn(aligned_ps);
> > +
> > +        frametable_virt_start -= paddr_to_pfn(aligned_ps);
> > +    }
> > +    else
> > +        panic("%s shouldn't be called twice\n", __func__);
> 
> As said on the v4 thread - I don't think this is needed. Aiui Misra
> would
> actually dislike it, as it's unreachable code. Just to re-iterate: My
> complaint there wasn't about this missing check, but about the
> function
> partly giving the impression of expecting to be called more than
> once.
I’ll revert this check, then. Would it be better—and sufficient—to add
a comment before setup_frametable_mappings() indicating that this
function should only be called once, to avoid any impression that it
might be expected to be called multiple times?

> 
> > +    if ( frametable_size > FRAMETABLE_SIZE )
> > +        panic("The frametable cannot cover [%#"PRIpaddr",
> > %#"PRIpaddr")\n",
> > +              ps, pe);
> > +
> > +    /*
> > +     * align base_mfn and frametable_size to MB(2) to have
> > superpage mapping
> > +     * in map_pages_to_xen()
> > +     */
> > +    frametable_size = ROUNDUP(frametable_size, MB(2));
> > +    base_mfn = alloc_boot_pages(frametable_size >> PAGE_SHIFT,
> > PFN_DOWN(MB(2)));
> 
> As you already use PFN_DOWN() once, why do you open-code it for the
> other
> argument?
Just overlooked that PFN_DOWN() could be used here.

>  You also use it ...
> 
> > +    if ( map_pages_to_xen(FRAMETABLE_VIRT_START, base_mfn,
> > +                          PFN_DOWN(frametable_size),
> 
> ... here, where the purpose of the argument is exactly the same.
> 
> > +void __init setup_mm(void)
> > +{
> > +    const struct membanks *banks = bootinfo_get_mem();
> > +    paddr_t ram_start = INVALID_PADDR;
> > +    paddr_t ram_end = 0;
> > +    unsigned int i;
> > +
> > +    /*
> > +     * We need some memory to allocate the page-tables used for
> > the directmap
> > +     * mappings. But some regions may contain memory already
> > allocated
> > +     * for other uses (e.g. modules, reserved-memory...).
> > +     *
> > +     * For simplicity, add all the free regions in the boot
> > allocator.
> > +     */
> > +    populate_boot_allocator();
> > +
> > +    for ( i = 0; i < banks->nr_banks; i++ )
> > +    {
> > +        const struct membank *bank = &banks->bank[i];
> > +        paddr_t bank_start = ROUNDUP(bank->start, PAGE_SIZE);
> > +        paddr_t bank_end = ROUNDDOWN(bank->start + bank->size,
> > PAGE_SIZE);
> > +        unsigned long bank_size = bank_end - bank_start;
> > +
> > +        ram_start = min(ram_start, bank_start);
> > +        ram_end = max(ram_end, bank_end);
> > +
> > +        setup_directmap_mappings(PFN_DOWN(bank_start),
> > PFN_DOWN(bank_size));
> > +    }
> > +
> > +    setup_frametable_mappings(ram_start, ram_end);
> 
> Just to double check: There is a guarantee that ->nr_banks isn't
> going to
> be zero? Else the setup_frametable_mappings() invocation here would
> badly
> degenerate.
Based on the what is mentioned in the device tree spec I would say yes
but based on the code we have in Xen ( and if I am not confusing
something ) it is still a chance that ->nr_banks could be zero.

>From the device tree spec:
```
A memory device node is required for all devicetrees and describes the
physical memory layout for the system. If a system
has multiple ranges of memory, multiple memory nodes can be created, or
the ranges can be specified in the reg property
of a single memory node.

Property Name  Usage   Value Type           Definition
device_type     R      <string>             Value shall be “memory”
reg             R      <prop-encoded-array> ...

Usage legend: R=Required, O=Optional, OR=Optional but Recommended,
SD=See Definition
```

So the memory node should present in devicetree ( interesting that few
years ago I remember that I could build devicetree w/o memory node,
anyway, Xen cheking a reg property of memory node and panic if
something wrong ).

In the same time, based on the code of device_tree_get_meminfo() (
xen/common/device-tree/bootfdt.c:169 ):
       prop = fdt_get_property(fdt, node, prop_name, NULL);
       if ( !prop )
           return -ENOENT;
   
       cell = (const __be32 *)prop->data;
       banks = fdt32_to_cpu(prop->len) / (reg_cells * sizeof (u32));
   
       for ( i = 0; i < banks && mem->nr_banks < mem->max_banks; i++ )
       {
           device_tree_get_reg(&cell, address_cells, size_cells,
   &start, &size);
           if ( mem == bootinfo_get_reserved_mem() &&
                check_reserved_regions_overlap(start, size) )
               return -EINVAL;
           /* Some DT may describe empty bank, ignore them */
           if ( !size )
               continue;
           mem->bank[mem->nr_banks].start = start;
           mem->bank[mem->nr_banks].size = size;
           mem->bank[mem->nr_banks].type = type;
           mem->nr_banks++;
       }
   
It is possible that mem->nr_banks is equal to 0 ( when DT describe only
empty banks? ) but then it sounds to me like something wrong with DT.

Just to be sure that everything is okay with ->nr_banks I could suggest
to add BUG_ON(banks->nr_banks) in setup_mm() before the `for` cycle.
Does it make sense?

Thanks.

~ Oleksii



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.