[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v4 10/18] xen/riscv: implement p2m_set_range()



On 17.09.2025 23:55, Oleksii Kurochko wrote:
> --- a/xen/arch/riscv/p2m.c
> +++ b/xen/arch/riscv/p2m.c
> @@ -16,6 +16,7 @@
>  #include <asm/riscv_encoding.h>
>  
>  unsigned long __ro_after_init gstage_mode;
> +unsigned int __ro_after_init gstage_root_level;
>  
>  void __init gstage_mode_detect(void)
>  {
> @@ -53,6 +54,7 @@ void __init gstage_mode_detect(void)
>          if ( MASK_EXTR(csr_read(CSR_HGATP), HGATP_MODE_MASK) == mode )
>          {
>              gstage_mode = mode;
> +            gstage_root_level = modes[mode_idx].paging_levels - 1;
>              break;
>          }
>      }
> @@ -210,6 +212,9 @@ int p2m_init(struct domain *d)
>      rwlock_init(&p2m->lock);
>      INIT_PAGE_LIST_HEAD(&p2m->pages);
>  
> +    p2m->max_mapped_gfn = _gfn(0);
> +    p2m->lowest_mapped_gfn = _gfn(ULONG_MAX);
> +
>      /*
>       * Currently, the infrastructure required to enable 
> CONFIG_HAS_PASSTHROUGH
>       * is not ready for RISC-V support.
> @@ -251,13 +256,287 @@ int p2m_set_allocation(struct domain *d, unsigned long 
> pages, bool *preempted)
>      return rc;
>  }
>  
> +/*
> + * Find and map the root page table. The caller is responsible for
> + * unmapping the table.

With the root table being 4 pages, "the root table" is slightly misleading
here: Yu never map the entire table.

> + * The function will return NULL if the offset into the root table is
> + * invalid.
> + */
> +static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn)
> +{
> +    unsigned long root_table_indx;
> +
> +    root_table_indx = gfn_x(gfn) >> P2M_LEVEL_ORDER(P2M_ROOT_LEVEL);
> +    if ( root_table_indx >= P2M_ROOT_PAGES )
> +        return NULL;
> +
> +    /*
> +     * The P2M root page table is extended by 2 bits, making its size 16KB
> +     * (instead of 4KB for non-root page tables). Therefore, p2m->root is
> +     * allocated as four consecutive 4KB pages (since alloc_domheap_pages()
> +     * only allocates 4KB pages).
> +     *
> +     * To determine which of these four 4KB pages the root_table_indx falls
> +     * into, we divide root_table_indx by
> +     * P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1).
> +     */
> +    root_table_indx /= P2M_PAGETABLE_ENTRIES(P2M_ROOT_LEVEL - 1);

The subtraction of 1 here feels odd: You're after the root table's
number of entries, i.e. I'd expect you to pass just P2M_ROOT_LEVEL.
And the way P2M_PAGETABLE_ENTRIES() works also suggests so.

> +/*
> + * Insert an entry in the p2m. This should be called with a mapping
> + * equal to a page/superpage.
> + */

I don't follow this comment: There isn't any mapping being passed in, is there?

> +static int p2m_set_entry(struct p2m_domain *p2m,
> +                           gfn_t gfn,
> +                           unsigned long page_order,
> +                           mfn_t mfn,
> +                           p2m_type_t t)

Nit: Indentation.

> +{
> +    unsigned int level;
> +    unsigned int target = page_order / PAGETABLE_ORDER;
> +    pte_t *entry, *table, orig_pte;
> +    int rc;
> +    /*
> +     * A mapping is removed only if the MFN is explicitly set to INVALID_MFN.
> +     * Other MFNs that are considered invalid by mfn_valid() (e.g., MMIO)
> +     * are still allowed.
> +     */
> +    bool removing_mapping = mfn_eq(mfn, INVALID_MFN);
> +    DECLARE_OFFSETS(offsets, gfn_to_gaddr(gfn));
> +
> +    ASSERT(p2m_is_write_locked(p2m));
> +
> +    /*
> +     * Check if the level target is valid: we only support
> +     * 4K - 2M - 1G mapping.
> +     */
> +    ASSERT(target <= 2);
> +
> +    table = p2m_get_root_pointer(p2m, gfn);
> +    if ( !table )
> +        return -EINVAL;
> +
> +    for ( level = P2M_ROOT_LEVEL; level > target; level-- )
> +    {
> +        /*
> +         * Don't try to allocate intermediate page table if the mapping
> +         * is about to be removed.
> +         */
> +        rc = p2m_next_level(p2m, !removing_mapping,
> +                            level, &table, offsets[level]);
> +        if ( (rc == P2M_TABLE_MAP_NONE) || (rc == P2M_TABLE_MAP_NOMEM) )
> +        {
> +            rc = (rc == P2M_TABLE_MAP_NONE) ? -ENOENT : -ENOMEM;
> +            /*
> +             * We are here because p2m_next_level has failed to map
> +             * the intermediate page table (e.g the table does not exist
> +             * and they p2m tree is read-only).

I thought I commented on this or something similar already: Calling the
p2m tree "read-only" is imo misleading.

> It is a valid case
> +             * when removing a mapping as it may not exist in the
> +             * page table. In this case, just ignore it.

I fear the "it" has no reference; aiui you mean "ignore the lookup failure",
but the comment isn't worded to refer to that by "it".

> +             */
> +            rc = removing_mapping ? 0 : rc;
> +            goto out;
> +        }
> +
> +        if ( rc != P2M_TABLE_NORMAL )
> +            break;
> +    }
> +
> +    entry = table + offsets[level];
> +
> +    /*
> +     * If we are here with level > target, we must be at a leaf node,
> +     * and we need to break up the superpage.
> +     */
> +    if ( level > target )
> +    {
> +        panic("Shattering isn't implemented\n");
> +    }
> +
> +    /*
> +     * We should always be there with the correct level because all the
> +     * intermediate tables have been installed if necessary.
> +     */
> +    ASSERT(level == target);
> +
> +    orig_pte = *entry;
> +
> +    if ( removing_mapping )
> +        p2m_clean_pte(entry, p2m->clean_dcache);
> +    else
> +    {
> +        pte_t pte = p2m_pte_from_mfn(mfn, t);
> +
> +        p2m_write_pte(entry, pte, p2m->clean_dcache);
> +
> +        p2m->max_mapped_gfn = gfn_max(p2m->max_mapped_gfn,
> +                                      gfn_add(gfn, BIT(page_order, UL) - 1));
> +        p2m->lowest_mapped_gfn = gfn_min(p2m->lowest_mapped_gfn, gfn);
> +    }
> +
> +    p2m->need_flush = true;
> +
> +    /*
> +     * Currently, the infrastructure required to enable 
> CONFIG_HAS_PASSTHROUGH
> +     * is not ready for RISC-V support.
> +     *
> +     * When CONFIG_HAS_PASSTHROUGH=y, iommu_iotlb_flush() should be done
> +     * here.
> +     */
> +#ifdef CONFIG_HAS_PASSTHROUGH
> +#   error "add code to flush IOMMU TLB"
> +#endif
> +
> +    rc = 0;
> +
> +    /*
> +     * Free the entry only if the original pte was valid and the base
> +     * is different (to avoid freeing when permission is changed).
> +     *
> +     * If previously MFN 0 was mapped and it is going to be removed
> +     * and considering that during removing MFN 0 is used then `entry`
> +     * and `new_entry` will be the same and p2m_free_subtree() won't be
> +     * called. This case is handled explicitly.
> +     */
> +    if ( pte_is_valid(orig_pte) &&
> +         (!mfn_eq(pte_get_mfn(*entry), pte_get_mfn(orig_pte)) ||
> +          (removing_mapping && mfn_eq(pte_get_mfn(*entry), _mfn(0)))) )
> +        p2m_free_subtree(p2m, orig_pte, level);

I continue to fail to understand why the MFN would matter here. Isn't the
need to free strictly tied to a VALID -> NOT VALID transition? A permission
change simply retains the VALID state of an entry.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.