[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 1/3] xen/riscv: introduce setup_initial_pages



Hi Julien,

On Mon, 2023-02-27 at 17:36 +0000, Julien Grall wrote:
> Hi Oleksii,
> 
> On 27/02/2023 16:52, Oleksii wrote:
> > On Sat, 2023-02-25 at 17:53 +0000, Julien Grall wrote:
> > > > +/*
> > > > + * WARNING: load_addr() and linker_addr() are to be called
> > > > only
> > > > when the MMU is
> > > > + * disabled and only when executed by the primary CPU.  They
> > > > cannot refer to
> > > > + * any global variable or functions.
> > > 
> > > I find interesting you are saying when
> > > _setup_initial_pagetables() is
> > > called from setup_initial_pagetables(). Would you be able to
> > > explain
> > > how
> > > this is different?
> > I am not sure that I understand your question correctly but
> > _setup_initial_pagetables() was introduced to map some addresses
> > with
> > write/read flag. Probably I have to rename it to something that is
> > more
> > clear.
> 
> So the comment suggests that you code cannot refer to global 
> functions/variables when the MMU is off. So I have multiple
> questions:
>    * Why only global? IOW, why static would be OK?
>    * setup_initial_pagetables() has a call to 
> _setup_initial_pagetables() (IOW referring to another function). Why
> is 
> it fine?
>    * You have code in the next patch referring to global variables 
> (mainly _start and _end). How is this different?
> 
> > > 
> > > > + */
> > > > +
> > > > +/*
> > > > + * Convert an addressed layed out at link time to the address
> > > > where it was loaded
> > > 
> > > Typo: s/addressed/address/ ?
> > Yes, it should be address. and 'layed out' should be changed to
> > 'laid
> > out'...
> > > 
> > > > + * by the bootloader.
> > > > + */
> > > 
> > > Looking at the implementation, you seem to consider that any
> > > address
> > > not
> > > in the range [linker_addr_start, linker_addr_end[ will have a 1:1
> > > mappings.
> > > 
> > > I am not sure this is what you want. So I would consider to throw
> > > an
> > > error if such address is passed.
> > I thought that at this stage and if no relocation was done it is
> > 1:1
> > except the case when load_addr_start != linker_addr_start.
> 
> The problem is what you try to map one to one may clash with the
> linked 
> region for Xen. So it is not always possible to map the region 1:1.
> 
> Therefore, I don't see any use for the else part here.
Got it. Thanks.

I am curious than what is the correct approach in general to handle
this situation?
I mean that throw an error it is one option but if I would like to do
that w/o throwing an error. Should it done some relocation in that
case?

> 
> > 
> > 
> > > 
> > > > +#define
> > > > load_addr(linker_address)
> > > >      \
> > > > +
> > > > ({
> > > >          \
> > > > +        unsigned long __linker_address = (unsigned
> > > > long)(linker_address);      \
> > > > +        if ( linker_addr_start <= __linker_address
> > > > &&                          \
> > > > +            __linker_address < linker_addr_end
> > > > )                               \
> > > > +
> > > > {
> > > >      \
> > > > +            __linker_address
> > > > =                                                 \
> > > > +                __linker_address - linker_addr_start +
> > > > load_addr_start;        \
> > > > +
> > > > }
> > > >      \
> > > > +
> > > > __linker_address;
> > > >      \
> > > > +    })
> > > > +
> > > > +/* Convert boot-time Xen address from where it was loaded by
> > > > the
> > > > boot loader to the address it was layed out
> > > > + * at link-time.
> > > > + */
> > > 
> > > Coding style: The first line is too long and multi-line comments
> > > look
> > > like:
> > > 
> > > /*
> > >    * Foo
> > >    * Bar
> > >    */
> > > 
> > > > +#define
> > > > linker_addr(load_address)
> > > >      \
> > > 
> > > Same remark as for load_addr() above.
> > > 
> > > > +
> > > > ({
> > > >          \
> > > > +        unsigned long __load_address = (unsigned
> > > > long)(load_address);          \
> > > > +        if ( load_addr_start <= __load_address
> > > > &&                              \
> > > > +            __load_address < load_addr_end
> > > > )                                   \
> > > > +
> > > > {
> > > >      \
> > > > +            __load_address
> > > > =                                                   \
> > > > +                __load_address - load_addr_start +
> > > > linker_addr_start;          \
> > > > +
> > > > }
> > > >      \
> > > > +
> > > > __load_address;
> > > >      \
> > > > +    })
> > > > +
> > > > +static void __attribute__((section(".entry")))
> > > > +_setup_initial_pagetables(pte_t *second, pte_t *first, pte_t
> > > > *zeroeth,
> > > Can this be named to setup_initial_mapping() so this is clearer
> > > and
> > > avoid the one '_' different with the function below.
> > Sure. It will be better.
> > > 
> > > > +                         unsigned long map_start,
> > > > +                         unsigned long map_end,
> > > > +                         unsigned long pa_start,
> > > > +                         bool writable)
> > > 
> > > What about the executable bit?
> > It's always executable... But as you mentioned above
> > PTE_LEAF_DEFAULT
> > should be either RX or RW.
> > I think it makes sense to add flags instead of writable.
> > > 
> > > > +{
> > > > +    unsigned long page_addr;
> > > > +    unsigned long index2;
> > > > +    unsigned long index1;
> > > > +    unsigned long index0;
> > > 
> > > index* could be defined in the loop below.
> > It could. But I am curious why it is better?
> > > 
> > > > +
> > > > +    /* align start addresses */
> > > > +    map_start &= ZEROETH_MAP_MASK;
> > > > +    pa_start &= ZEROETH_MAP_MASK;
> > > 
> > > Hmmm... I would actually expect the address to be properly
> > > aligned
> > > and
> > > therefore not require an alignment here.
> > > 
> > > Otherwise, this raise the question of what happen if you have
> > > region
> > > using the same page?
> > That map_start &=  ZEROETH_MAP_MASK is needed to page number of
> > address
> > w/o page offset.
> 
> My point is why would the page offset be non-zero?
I checked a linker script and addresses that passed to
setup_initial_mapping() and they are really always aligned so there is
no any sense in additional alignment.

> 
> > > 
> > > > +
> > > > +    page_addr = map_start;
> > > > +    while ( page_addr < map_end )
> > > 
> > > Looking at the loop, it looks like you are assuming that the
> > > region
> > > will
> > > never cross a boundary of a page-table (either L0, L1, L2). I am
> > > not
> > > convinced you can make such assumption (see below).
> > > 
> > > But if you really want to make such assumption then you should
> > > add
> > > some
> > > guard (either BUILD_BUG_ON(), ASSERT(), proper check) in your
> > > code to
> > > avoid any surprise in the future.
> > I am not sure that I fully understand what is the problem here.
> > The address is aligned on (1<<12) boundary and each itearation is
> > mapped (1<<12) page so all looks fine or I misunderstood you.
> 
> Let's take an example, imagine the region you want to map is 4MB. 
> AFAICT, you are only passing one L0 page-table. So your code will end
> up 
> to overwrite the previous entries in the zeroeth page-table and then
> add 
> another link in the L1 page-table.
Got it. Then it looks that current approach isn't correct totally...

> 
> > > 
> > > > +    {
> > > > +        index2 = pagetable_second_index(page_addr);
> > > > +        index1 = pagetable_first_index(page_addr);
> > > > +        index0 = pagetable_zeroeth_index(page_addr);
> > > > +
> > > > +        /* Setup level2 table */
> > > > +        second[index2] = paddr_to_pte((unsigned long)first);
> > > > +        second[index2].pte |= PTE_TABLE;
> > > > +
> > > > +        /* Setup level1 table */
> > > > +        first[index1] = paddr_to_pte((unsigned long)zeroeth);
> > > > +        first[index1].pte |= PTE_TABLE;
> > > > +
> > > > +        /* Setup level0 table */
> > > > +        if ( !pte_is_valid(&zeroeth[index0]) )
> > > 
> > > Can you explain why you are checking !pte_is_valid() for the L0
> > > entry
> > > but not the other?
> > My mistake it should be checked for each level.
> 
> In which case, shouldn't you return an error if the entry is always
> valid?
> 
> > > 
> > > > +        {
> > > > +            /* Update level0 table */
> > > > +            zeroeth[index0] = paddr_to_pte((page_addr -
> > > > map_start)
> > > > + pa_start);
> > > > +            zeroeth[index0].pte |= PTE_LEAF_DEFAULT;
> > > > +            zeroeth[index0].pte &= ~((!writable) ?
> > > > PTE_WRITABLE :
> > > > 0);
> > > 
> > > Looking at the default value, it would mean that a non-writable
> > > mapping
> > > is automatically executable. This seems wrong for the section is
> > > not
> > > meant to be executable (like rodata).
> > Yes, you are right. I'll reowrk setup_initial_mapping() to pass
> > flags
> > instead of write/read - only flag.
> > > 
> > > > +        }
> > > > +
> > > > +        /* Point to next page */
> > > > +        page_addr += ZEROETH_SIZE;
> > > > +    }
> > > > +}
> > > > +
> > > > +/*
> > > > + * setup_initial_pagetables:
> > > > + *
> > > > + * 1) Build the page tables for Xen that map the following:
> > > > + *   1.1)  The physical location of Xen (where the bootloader
> > > > loaded it)
> > > > + *   1.2)  The link-time location of Xen (where the linker
> > > > expected Xen's
> > > > + *         addresses to be)
> > > > + * 2) Load the page table into the SATP and enable the MMU
> > > > + */
> > > > +void __attribute__((section(".entry")))
> > > 
> > > I couldn't find a section ".entry" in the linker.
> > > 
> > > > +setup_initial_pagetables(unsigned long load_addr_start,
> > > > +                         unsigned long load_addr_end,
> > > > +                         unsigned long linker_addr_start,
> > > > +                         unsigned long linker_addr_end)
> > > > +{
> > > > +    pte_t *second;
> > > > +    pte_t *first;
> > > > +    pte_t *zeroeth;
> > > > +
> > > > +    clear_pagetables(load_addr_start, load_addr_end,
> > > > +                     linker_addr_start, linker_addr_end);
> > > > +
> > > > +    /* Get the addresses where the page tables were loaded */
> > > > +    second  = (pte_t *)load_addr(&xen_second_pagetable);
> > > > +    first   = (pte_t *)load_addr(&xen_first_pagetable);
> > > > +    zeroeth = (pte_t *)load_addr(&xen_zeroeth_pagetable);
> > > 
> > > I would consider to embed the type cast in load_addr() so you are
> > > adding
> > > some type safety within your code.
> > Definitely it will be better but setup_initial_mapping() uses
> > 'unsigned
> > long' for passing address that should be mapped.
> 
> One possibility would be to introduce a new wrapper for the
> typesafety. 
> Anyway, it is not essential for now. Let's at least get the logic 
> correct first :).
Agree. the logic should be fixed first.
> 
> > > 
> > > > +
> > > > +    /*
> > > > +     * Create a mapping from Xen's link-time addresses to
> > > > where
> > > > they were actually loaded.
> > > 
> > > This is line is way long than 80 characters. Please make sure to
> > > wrap
> > > it
> > > 80 characters.
> > > 
> > > > +     */
> > > > +    _setup_initial_pagetables(second, first, zeroeth,
> > > > +                              linker_addr(&_stext),
> > > > +                              linker_addr(&_etext),
> > > > +                              load_addr(&_stext),
> > > > +                              false);
> > > > +    _setup_initial_pagetables(second, first, zeroeth,
> > > > +                              linker_addr(&__init_begin),
> > > > +                              linker_addr(&__init_end),
> > > > +                              load_addr(&__init_begin),
> > > > +                              true);
> > > > +    _setup_initial_pagetables(second, first, zeroeth,
> > > > +                              linker_addr(&_srodata),
> > > > +                              linker_addr(&_erodata),
> > > > +                              load_addr(&_srodata),
> > > > +                              false);
> > > > +    _setup_initial_pagetables(second, first, zeroeth,
> > > > +                              linker_addr_start,
> > > > +                              linker_addr_end,
> > > > +                              load_addr_start,
> > > > +                              true);
> > > 
> > > Where do you guarantee that Xen will always fit in an L0 table
> > > and
> > > the
> > > start address is aligned to the size of an L0 table?
> > I don't guarantee that it fit in an L0 table but the start address
> > is
> > aligned to the size of the L0 table at the start.
> Then it should be fixed.
> 
> > > 
> > > > +
> > > > +    /*
> > > > +     * Create a mapping of the load time address range to...
> > > > the
> > > > load time address range.
> > > 
> > > Same about the line length here.
> > > 
> > > > +     * This mapping is used at boot time only.
> > > > +     */
> > > > +    _setup_initial_pagetables(second, first, zeroeth,
> > > 
> > > This can only work if Xen is loaded at its linked address. So you
> > > need a
> > > separate set of L0, L1 tables for the identity mapping.
> > > 
> > > That said, this would not be sufficient because:
> > >     1) Xen may not be loaded at a 2M boundary (you can control
> > > with
> > > U-boot, but not with EFI). So this may cross a boundary and
> > > therefore
> > > need multiple pages.
> > >     2) The load region may overlap the link address
> > > 
> > > While I think it would be good to handle those cases from the
> > > start,
> > > I
> > > would understand why are not easy to solve. So I think the
> > > minimum is
> > > to
> > > throw some errors if you are in a case you can't support.
> > Do you mean to throw some error in load_addr()/linkder_addr()?
> 
> In this case, I meant to check if load_addr != linker_addr, then
> throw 
> an error.
I am not sure that it is needed now and it is easier to throw an error
but is option exist to handler situation when load_addr != linker_addr
except throwing an error? relocate?


~ Oleksii

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.