[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 5/8] xen/riscv: introduce identity mapping


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 12 Jun 2023 15:48:23 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=THykHUDivjtyj8SV40sWxFCHsjWEitcQNrdaK+5l+Nk=; b=Tv0yKwudQ/tN3g2BiVIFXMp0f90P+MO6Wv/0jBz0ELvAPJhCbn4Qh9tl8p4uGx9z91Ef26aVGxeWOTbcGdEOHfsi+4y8C+W+YtXtlrH7IhOI4bJ8ibzv0ie85/hATYtdCahgZUNUf2EYAWD611T0t87mQwCe330fXeQ4arOaVa9wczveVXqgaZvgGoPsavYe0kpk65oeGJrkuo3isFwhM6c2y9HDwnf42J/wA2jQP1o+xMXMH1QrFoGqYalIBGJr2FCszBjF1rzxBrDMF/XSy2MSJa5XdVaFsHVNL+iXmzyhEKnIejKtk4fupx/YyanBk3WZUSnFuGLOthAWAPzCgQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Xt5XTC+2dpfMrDOC2ukXbARp/WKeIM8ROU93AJUfsMWf61qEWGo6aXBy5fVE/s8Gg3CIoVwSxoNgGEIR1rdkupj8/9Z5kC+GpAfDo2Pt9wUoyAA1ChPHtlSHBZtEjdn1D1xCN2UxssyIiIfsF4NLWoXbui/4NFvngg0f7XFCW2ydbB2zzJDxHEq4dPxP4dtFZASxdxo5MV7o+25k3ieBHWcyypAH6sDjNX2/Msh7vGT0Olavf6DlVo2GTzpk1tY/fmmwK5KgMLVKPX5Ue19rDuhKruJeCO4Fb4RoQ+ohLbK7fVzbzohY+wrBPpEoTso4ZbBay5DTuVAVvEBk3v/8Fg==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Alistair Francis <alistair.francis@xxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 12 Jun 2023 13:48:45 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 06.06.2023 21:55, Oleksii Kurochko wrote:
> The way how switch to virtual address was implemented in the
> commit e66003e7be ("xen/riscv: introduce setup_initial_pages")
> wasn't safe enough so identity mapping was introduced and
> used.

I don't think this is sufficient as a description. You want to make
clear what the "not safe enough" is, and you also want to go into
more detail as to the solution chosen. I'm particularly puzzled that
you map just two singular pages ...

> @@ -35,8 +40,10 @@ static unsigned long phys_offset;
>   *
>   * It might be needed one more page table in case when Xen load address
>   * isn't 2 MB aligned.
> + *
> + * 3 additional page tables are needed for identity mapping.
>   */
> -#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 1) + 1)
> +#define PGTBL_INITIAL_COUNT ((CONFIG_PAGING_LEVELS - 1) + 1 + 3)

What is this 3 coming from? It feels like the value should (again)
somehow depend on CONFIG_PAGING_LEVELS.

> @@ -108,16 +116,18 @@ static void __init setup_initial_mapping(struct 
> mmu_desc *mmu_desc,
>              {
>                  unsigned long paddr = (page_addr - map_start) + pa_start;
>                  unsigned int permissions = PTE_LEAF_DEFAULT;
> +                unsigned long addr = (is_identity_mapping) ?

Nit: No need for parentheses here.

> +                                     page_addr : LINK_TO_LOAD(page_addr);

As a remark, while we want binary operators at the end of lines when
wrapping, we usually do things differently for the ternary operator:
Either

                unsigned long addr = is_identity_mapping
                                     ? page_addr : LINK_TO_LOAD(page_addr);

or

                unsigned long addr = is_identity_mapping
                                     ? page_addr
                                     : LINK_TO_LOAD(page_addr);

.

> @@ -232,22 +242,27 @@ void __init setup_initial_pagetables(void)
>                            linker_start,
>                            linker_end,
>                            load_start);
> +
> +    if ( linker_start == load_start )
> +        return;
> +
> +    setup_initial_mapping(&mmu_desc,
> +                          load_start,
> +                          load_start + PAGE_SIZE,
> +                          load_start);
> +
> +    setup_initial_mapping(&mmu_desc,
> +                          (unsigned long)cpu0_boot_stack,
> +                          (unsigned long)cpu0_boot_stack + PAGE_SIZE,

Shouldn't this be STACK_SIZE (and then also be prepared for
STACK_SIZE > PAGE_SIZE)?

> +                          (unsigned long)cpu0_boot_stack);
>  }
>  
> -void __init noreturn noinline enable_mmu()
> +/*
> + * enable_mmu() can't be __init because __init section isn't part of identity
> + * mapping so it will cause an issue after MMU will be enabled.
> + */

As hinted at above already - perhaps the identity mapping wants to be
larger, up to covering the entire Xen image? Since it's temporary
only anyway, you could even consider using a large page (and RWX
permission). You already require no overlap of link and load addresses,
so at least small page mappings ought to be possible for the entire
image.

> @@ -255,25 +270,41 @@ void __init noreturn noinline enable_mmu()
>      csr_write(CSR_SATP,
>                PFN_DOWN((unsigned long)stage1_pgtbl_root) |
>                RV_STAGE1_MODE << SATP_MODE_SHIFT);
> +}
> +
> +void __init remove_identity_mapping(void)
> +{
> +    int i, j;

Nit: unsigned int please.

> +    pte_t *pgtbl;
> +    unsigned int index, xen_index;

These would all probably better be declared in the narrowest possible
scope.

> -    asm volatile ( ".p2align 2" );
> - mmu_is_enabled:
>      /*
> -     * Stack should be re-inited as:
> -     * 1. Right now an address of the stack is relative to load time
> -     *    addresses what will cause an issue in case of load start address
> -     *    isn't equal to linker start address.
> -     * 2. Addresses in stack are all load time relative which can be an
> -     *    issue in case when load start address isn't equal to linker
> -     *    start address.
> -     *
> -     * We can't return to the caller because the stack was reseted
> -     * and it may have stash some variable on the stack.
> -     * Jump to a brand new function as the stack was reseted
> +     * id_addrs should be in sync with id mapping in
> +     * setup_initial_pagetables()

What is "id" meant to stand for here? Also if things need keeping in
sync, then a similar comment should exist on the other side.

>       */
> +    unsigned long id_addrs[] =  {
> +                                 LINK_TO_LOAD(_start),
> +                                 LINK_TO_LOAD(cpu0_boot_stack),
> +                                };
>  
> -    switch_stack_and_jump((unsigned long)cpu0_boot_stack + STACK_SIZE,
> -                          cont_after_mmu_is_enabled);
> +    pgtbl = stage1_pgtbl_root;
> +
> +    for ( j = 0; j < ARRAY_SIZE(id_addrs); j++ )
> +    {
> +        for ( pgtbl = stage1_pgtbl_root, i = CONFIG_PAGING_LEVELS - 1; i >= 
> 0; i-- )
> +        {
> +            index = pt_index(i, id_addrs[j]);
> +            xen_index = pt_index(i, XEN_VIRT_START);
> +
> +            if ( index != xen_index )
> +            {
> +                pgtbl[index].pte = 0;
> +                break;
> +            }

Up to here I understand index specifies a particular PTE within pgtbl[].

> +            pgtbl = &pgtbl[index];

But then how can this be the continuation of the page table walk? Don't
you need to read the address out of the PTE?

> --- a/xen/arch/riscv/riscv64/head.S
> +++ b/xen/arch/riscv/riscv64/head.S
> @@ -31,6 +31,36 @@ ENTRY(start)
>  
>          jal     calc_phys_offset
>  
> +        jal     setup_initial_pagetables
> +
> +        jal     enable_mmu
> +
> +        /*
> +         * Calculate physical offset
> +         *
> +         * We can't re-use a value in phys_offset variable here as
> +         * phys_offset is located in .bss and this section isn't
> +         * 1:1 mapped and an access to it will cause MMU fault
> +         */
> +        li      t0, XEN_VIRT_START
> +        la      t1, start
> +        sub     t1, t1, t0

If I'm not mistaken this calculates start - XEN_VIRT_START, and ...

> +        /* Calculate proper VA after jump from 1:1 mapping */
> +        la      t0, .L_primary_switched
> +        sub     t0, t0, t1

... then this .L_primary_switched - (start - XEN_VIRT_START). Which can
also be expressed as (.L_primary_switched - start) + XEN_VIRT_START,
the first part of which gas ought to be able to resolve for you. But
upon experimenting a little it looks like it can't. (I had thought of
something along the lines of

        li      t0, .L_primary_switched - start
        li      t1, XEN_VIRT_START
        add     t0, t0, t1

or even

        li      t1, XEN_VIRT_START
        add     t0, t1, %pcrel_lo(.L_primary_switched - start)

.)

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.