
Re: [PATCH v2 1/3] xen/riscv: introduce setup_mm()



> > Finally, regarding masking off the top bits of mfn, I'm not entirely
> > clear on how this should work. If I understand correctly, if I mask
> > off certain top bits in mfn, then I would need to unmask those same
> > top bits in maddr_to_virt() and virt_to_maddr(). Is that correct?
> > 
> > Another point I’m unclear on is which specific part of the top bits
> > should be masked.
> 
> You want to "move" the directmap such that the first legitimate RAM
> page is within the first (large/huge) page mapping of the directmap.
> IOW the "virtual" start of the directmap would move down in VA space.
> That still leaves things at a simple offset calculation when
> translating VA <-> PA.
> 
> To give an example: Let's assume RAM starts at 61.5 Gb, and you want
> to use 1Gb mappings for the bulk of the directmap. Then the "virtual"
> start of the directmap would shift down to DIRECTMAP_VIRT_START -
> 60Gb, such that the first RAM page would be mapped at
> DIRECTMAP_VIRT_START + 1.5Gb. IOW it would be the low 30 address bits
> of the start address that you use (30 - PAGE_SHIFT for the MFN), with
> the higher bits contributing to the offset involved in the VA <-> PA
> translation. Values used depend on the (largest) page size you mean to
> use for the direct map: On systems with terabytes of memory (demanding
> Sv48 or even Sv57 mode) you may want to use 512Gb mappings, and hence
> you'd then need to mask the low 39 bits (or 48 for 256Tb mappings).
Thanks a lot for the clarification. IIUC, not too many things should be
changed then: only the virtual address used for the directmap mapping
and the calculation of the proper virtual start address of the
directmap:
   --- a/xen/arch/riscv/mm.c
   +++ b/xen/arch/riscv/mm.c
   @@ -457,6 +457,7 @@ vaddr_t __ro_after_init directmap_virt_start;
    static void __init setup_directmap_mappings(unsigned long base_mfn,
                                                unsigned long nr_mfns)
    {
   +    unsigned long base_addr = mfn_to_maddr(_mfn(base_mfn));
        int rc;
    
        /* First call sets the directmap physical and virtual offset. */
   @@ -473,14 +474,14 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
             * Prevent that by offsetting the start of the directmap virtual
             * address.
             */
   -        directmap_virt_start = DIRECTMAP_VIRT_START + pfn_to_paddr(base_mfn);
   +        directmap_virt_start = DIRECTMAP_VIRT_START - (base_addr & ~(XEN_PT_LEVEL_SIZE(HYP_PT_ROOT_LEVEL) - 1));
        }
    
        if ( base_mfn < mfn_x(directmap_mfn_start) )
            panic("cannot add directmap mapping at %#lx below heap start %#lx\n",
                  base_mfn, mfn_x(directmap_mfn_start));
    
   -    rc = map_pages_to_xen((vaddr_t)mfn_to_virt(base_mfn),
   +    rc = map_pages_to_xen(DIRECTMAP_VIRT_START + (base_addr & (XEN_PT_LEVEL_SIZE(HYP_PT_ROOT_LEVEL) - 1)),
                              _mfn(base_mfn), nr_mfns,
                              PAGE_HYPERVISOR_RW);
   
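To double-check the masking, here is a quick standalone calculation for
the 61.5Gb example (not Xen code; the GB() helper, the
DIRECTMAP_VIRT_START value and the 1Gb ROOT_LEVEL_SIZE are made-up
stand-ins for the Sv39 case). The property that matters is that VA and
PA keep the same offset within a root-level superpage, so the bulk of
the directmap can still use 1Gb mappings:

   #include <assert.h>
   #include <stdint.h>
   #include <stdio.h>

   #define GB(x)                ((uint64_t)(x) << 30)
   #define DIRECTMAP_VIRT_START 0xffffffc000000000ULL /* made-up value */
   #define ROOT_LEVEL_SIZE      GB(1)                 /* 1Gb root-level mappings */

   int main(void)
   {
       uint64_t base_addr = GB(61) + GB(1) / 2;        /* RAM starts at 61.5Gb */
       uint64_t directmap_virt_start =
           DIRECTMAP_VIRT_START - (base_addr & ~(ROOT_LEVEL_SIZE - 1));
       uint64_t first_ram_va = directmap_virt_start + base_addr;

       /* The first RAM page still lands inside the directmap region... */
       assert(first_ram_va >= DIRECTMAP_VIRT_START);
       /* ...and VA and PA share the same offset within a 1Gb superpage. */
       assert((first_ram_va & (ROOT_LEVEL_SIZE - 1)) ==
              (base_addr & (ROOT_LEVEL_SIZE - 1)));

       printf("directmap_virt_start=%#lx first_ram_va=%#lx\n",
              (unsigned long)directmap_virt_start, (unsigned long)first_ram_va);
       return 0;
   }

(With this masking the shift is 61Gb rather than the 60Gb from the
example above, so the first RAM page ends up at DIRECTMAP_VIRT_START +
0.5Gb; the alignment property is the same either way.)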

And then, of course, use directmap_virt_start in maddr_to_virt() and
virt_to_maddr():
   @@ -31,7 +31,7 @@ static inline void *maddr_to_virt(paddr_t ma)
    
        ASSERT(va_offset < DIRECTMAP_SIZE);
    
   -    return (void *)(XENHEAP_VIRT_START + va_offset);
   +    return (void *)(directmap_virt_start + va_offset);
    }
    
    /*
   @@ -44,7 +44,7 @@ static inline unsigned long virt_to_maddr(unsigned long va)
    {
        if ((va >= XENHEAP_VIRT_START) &&
            (va < (DIRECTMAP_VIRT_START + DIRECTMAP_SIZE)))
   -        return directmapoff_to_maddr(va - XENHEAP_VIRT_START);
   +        return directmapoff_to_maddr(va - directmap_virt_start);
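
The round trip only relies on both helpers using the same
directmap_virt_start, not on the exact shift chosen, so shifting by
60Gb (as in the quoted example) or by 61Gb (what the masking above
computes) works equally well. A rough host-side model of the two
helpers (hypothetical model_* functions, with the directmap offset
simplified to the plain machine address):

   #include <assert.h>
   #include <stdint.h>

   /* Simplified stand-ins for maddr_to_virt()/virt_to_maddr(). */
   static uint64_t virt_start; /* plays the role of directmap_virt_start */

   static uint64_t model_maddr_to_virt(uint64_t ma) { return virt_start + ma; }
   static uint64_t model_virt_to_maddr(uint64_t va) { return va - virt_start; }

   int main(void)
   {
       const uint64_t pa = (61ULL << 30) + (1ULL << 29);   /* 61.5Gb */

       /* Shift by 60Gb, as in the quoted example... */
       virt_start = 0xffffffc000000000ULL - (60ULL << 30);
       assert(model_virt_to_maddr(model_maddr_to_virt(pa)) == pa);

       /* ...or by 61Gb, as computed by the masking above: both round-trip. */
       virt_start = 0xffffffc000000000ULL - (61ULL << 30);
       assert(model_virt_to_maddr(model_maddr_to_virt(pa)) == pa);

       return 0;
   }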

> 
> > > > +void __init setup_mm(void)
> > > > +{
> > > > +    const struct membanks *banks = bootinfo_get_mem();
> > > > +    paddr_t ram_start = INVALID_PADDR;
> > > > +    paddr_t ram_end = 0;
> > > > +    paddr_t ram_size = 0;
> > > > +    unsigned int i;
> > > > +
> > > > +    /*
> > > > +     * We need some memory to allocate the page-tables used for
> > > > +     * the directmap mappings. But some regions may contain
> > > > +     * memory already allocated for other uses (e.g. modules,
> > > > +     * reserved-memory...).
> > > > +     *
> > > > +     * For simplicity, add all the free regions in the boot
> > > > +     * allocator.
> > > > +     */
> > > > +    populate_boot_allocator();
> > > > +
> > > > +    total_pages = 0;
> > > > +
> > > > +    for ( i = 0; i < banks->nr_banks; i++ )
> > > > +    {
> > > > +        const struct membank *bank = &banks->bank[i];
> > > > +        paddr_t bank_end = bank->start + bank->size;
> > > > +
> > > > +        ram_size += ROUNDDOWN(bank->size, PAGE_SIZE);
> > > 
> > > As before - if a bank doesn't cover full pages, this may give the
> > > impression of there being more "total pages" than there are.
> > Since it rounds down to PAGE_SIZE, if ram_start is 2K and the total
> > size of a bank is 11K, ram_size will end up being 8K, so the "total
> > pages" will cover less RAM than the actual size of the RAM bank.
> 
> ram_start at 2k but bank size being 13k would yield 2 usable pages
> (first partial page of 2k unusable and last partial page of 3k
> unusable), yet ram_size of 12k (3 pages). You need to consider the
> common case; checking things work for a randomly chosen example isn't
> enough.
Then I have to check the start and end of each bank separately and
reduce ram_size accordingly when the start or end isn't properly
page-aligned.
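
For instance, a minimal sketch of how the loop body could clamp each
bank (reusing the names from the patch above; the bank_start/bank_end
locals are new, and ROUNDUP is assumed to be available alongside
ROUNDDOWN):

       const struct membank *bank = &banks->bank[i];
       /* Only account for and map the fully usable, page-aligned part. */
       paddr_t bank_start = ROUNDUP(bank->start, PAGE_SIZE);
       paddr_t bank_end = ROUNDDOWN(bank->start + bank->size, PAGE_SIZE);

       /* A bank smaller than one full page contributes nothing. */
       if ( bank_end <= bank_start )
           continue;

       ram_size += bank_end - bank_start;
       ram_start = min(ram_start, bank_start);
       ram_end = max(ram_end, bank_end);

       setup_directmap_mappings(PFN_DOWN(bank_start),
                                PFN_DOWN(bank_end - bank_start));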

> 
> > > > +        ram_start = min(ram_start, bank->start);
> > > > +        ram_end = max(ram_end, bank_end);
> > > > +
> > > > +        setup_directmap_mappings(PFN_DOWN(bank->start),
> > > > +                                 PFN_DOWN(bank->size));
> > > 
> > > Similarly I don't think this is right when both start and size
> > > aren't
> > > multiple of PAGE_SIZE. You may map an unsuable partial page at
> > > the
> > > start,
> > > and then fail to map a fully usable page at the end.
> > ram_size should be a multiple of PAGE_SIZE because we have:
> >     ram_size += ROUNDDOWN(bank->size, PAGE_SIZE);
> > 
> > Do you know of any examples where bank->start isn't aligned to
> > PAGE_SIZE?
> 
> Question is the other way around: Is it specified anywhere that start
> (and size) _need_ to be aligned? And if it is - do all firmware makers
> play by that (on x86 at least specifications often mean pretty little
> to firmware people, apparently)?
Yes, I understand that. I tried to find such a requirement somewhere in
the priv/unpriv specs and wasn't able to, which is why I asked whether
it should be mentioned somewhere. Anyway, I think it will be better
just to update the code so that it works in any case.

Thanks.

~ Oleksii




 

