[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 1/3] xen/riscv: implement software page table walking




On 1/27/25 11:06 AM, Jan Beulich wrote:
On 20.01.2025 17:54, Oleksii Kurochko wrote:
RISC-V doesn't have hardware feature to ask MMU to translate
virtual address to physical address ( like Arm has, for example ),
so software page table walking in implemented.

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
---
 xen/arch/riscv/include/asm/mm.h |  2 ++
 xen/arch/riscv/pt.c             | 56 +++++++++++++++++++++++++++++++++
 2 files changed, 58 insertions(+)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 292aa48fc1..d46018c132 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -15,6 +15,8 @@
 
 extern vaddr_t directmap_virt_start;
 
+paddr_t pt_walk(vaddr_t va);
In the longer run, is returning just the PA really going to be sufficient?
If not, perhaps say a word on the limitation in the description.
In the long run, this function's prototype looks like paddr_t pt_walk(vaddr_t root, vaddr_t va, bool is_xen) [1]. However, I'm not sure if it will stay that way,
as I think is_xen could be skipped, since using map_table() should be sufficient (as it now considers system_state) and I'm not really sure if I need root argument
as initial goal was to use this function for debug only purposes and I've never used it for guest page table (stage-1) walking.
Anyway, yes, it is still returning a physical address, and that seems enough to me.
Could you share your thoughts on what I should take into account for returning value, probably, I am missing something really useful?
[1] https://gitlab.com/xen-project/people/olkur/xen/-/blob/latest/xen/arch/riscv/mm.c#L820
+     * initialize `ret` with "impossible" XEN_TABLE_MAP_NOMEM.
+    */
+    int ret = XEN_TABLE_MAP_NOMEM;
+    unsigned int level = HYP_PT_ROOT_LEVEL;
+    paddr_t pa = 0;
Seeing 0 as initializer here, just to double check: You do prevent MFN 0
to be handed to the page allocator, and you also don't use it "manually"
anywhere?
MFN 0 could be used when removing the page: https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/riscv/pt.c?ref_type=heads#L251.
In that case, it would be better to initialize pa with (paddr_t)(-1), as this value couldn't be mapped and is safe to use as an invalid value.

+    pte_t *table;
+
+    DECLARE_OFFSETS(offsets, va);
+
+    table = map_table(root);
+
+    /*
+     * Find `pa` of an entry which corresponds to `va` by iterating for each
+     * page level and checking if the entry points to a next page table or
+     * to a page.
+     *
+     * Two cases are possible:
+     * - ret == XEN_TABLE_SUPER_PAGE means that the entry was find;
+     *   (Despite of the name) XEN_TABLE_SUPER_PAGE covers 4k mapping too.
+     * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't actually
+     *   mapped.
+     */
+    while ( (ret != XEN_TABLE_MAP_NONE) && (ret != XEN_TABLE_SUPER_PAGE) )
+    {
+        /*
+         * This case shouldn't really occur as it will mean that for table
+         * level 0 a pointer to next page table has been written, but at
+         * level 0 it could be only a pointer to 4k page.
+         */
+        ASSERT(level <= HYP_PT_ROOT_LEVEL);
+
+        ret = pt_next_level(false, &table, offsets[level]);
+        level--;
+    }
+
+    if ( ret == XEN_TABLE_MAP_NONE )
+        dprintk(XENLOG_WARNING, "Is va(%#lx) really mapped?\n", va);
Even if it's a dprintk(), I'd recommend against adding such.

+    else if ( ret == XEN_TABLE_SUPER_PAGE )
+        pa = pte_to_paddr(*(table + offsets[level + 1]));
Missing "else if ( ret == XEN_TABLE_NORMAL )" (or maybe simply "else")?
If I am not missing something, we can't be here with ret == XEN_TABLE_NORMAL because we iterating
in the while loop above until we don't find a leaf or until reach level = 0. If we find a leaf then
XEN_TABLE_SUPER_PAGE is returned; otherwise sooner or later we should face a case when next table
(in case when `level`=0 and someone put at this level a pointer to next level, what is a bug) should
be allocated in pt_next_level(), but it will fail because `alloc_tbl`=false is passed to pt_next_level()
and thereby ret=XEN_TABLE_MAP_NONE() will be returned.

Based on your previous comment about dprintk this could could be re-written in the following way:
-    if ( ret == XEN_TABLE_MAP_NONE )
-        dprintk(XENLOG_WARNING, "Is va(%#lx) really mapped?\n", va);
-    else if ( ret == XEN_TABLE_SUPER_PAGE )
+    if ( ret != XEN_TABLE_MAP_NONE )
         pa = pte_to_paddr(*(table + offsets[level + 1]));

+    return pa;
Don't you want to OR in the low 12 bits of VA here (unless "pa" is still 0)?
It is a bug, and IIUC if `pa` isn't 0 it is still need to add the low bits of VA to `pa`:
    return pa | (va & ((1 << PAGE_SHIFT) - 1));
(I think that I saw somewhere a macros to generate masks but can't find where)

Thanks.

~ Oleksii

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.