[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 1/3] xen/riscv: implement software page table walking




On 1/28/25 9:14 AM, Jan Beulich wrote:
On 27.01.2025 18:22, Oleksii Kurochko wrote:
On 1/27/25 1:57 PM, Jan Beulich wrote:
On 27.01.2025 13:29, Oleksii Kurochko wrote:
On 1/27/25 11:06 AM, Jan Beulich wrote:
On 20.01.2025 17:54, Oleksii Kurochko wrote:
RISC-V doesn't have hardware feature to ask MMU to translate
virtual address to physical address ( like Arm has, for example ),
so software page table walking in implemented.

Signed-off-by: Oleksii Kurochko<oleksii.kurochko@xxxxxxxxx>
---
   xen/arch/riscv/include/asm/mm.h |  2 ++
   xen/arch/riscv/pt.c             | 56 +++++++++++++++++++++++++++++++++
   2 files changed, 58 insertions(+)

diff --git a/xen/arch/riscv/include/asm/mm.h b/xen/arch/riscv/include/asm/mm.h
index 292aa48fc1..d46018c132 100644
--- a/xen/arch/riscv/include/asm/mm.h
+++ b/xen/arch/riscv/include/asm/mm.h
@@ -15,6 +15,8 @@
   
   extern vaddr_t directmap_virt_start;
   
+paddr_t pt_walk(vaddr_t va);
In the longer run, is returning just the PA really going to be sufficient?
If not, perhaps say a word on the limitation in the description.
In the long run, this function's prototype looks like|paddr_t pt_walk(vaddr_t root, vaddr_t va, bool is_xen)| [1]. However, I'm not sure if it will stay that way,
as I think|is_xen| could be skipped, since using|map_table()| should be sufficient (as it now considers|system_state|) and I'm not really sure if I need root argument
as initial goal was to use this function for debug only purposes and I've never used it for guest page table (stage-1) walking.
Anyway, yes, it is still returning a physical address, and that seems enough to me.

Could you share your thoughts on what I should take into account for returning value, probably, I am missing something really useful?
Often you care about the permissions as well. Sometimes it may even be relevant
to know the (super-)page size of the mapping.
Perhaps it would be better to change the prototype to:
   bool pt_walk(vaddr_t va, mfn_t *ret_pa);
or even
   void pt_walk(vaddr_t va, mfn_t *ret_pa);
   In this case,|ret_pa = INVALID_MFN| could serve as a signal that|pt_walk()| failed.
If there's a need to return permissions or (super-)page size in the future, another argument could be added.

What do you think? Would this approach be better?

I am also considering returning a structure containing the|mfn| (or|paddr_t|) and adding other properties (such as permissions or
page size) as needed in the future. Both solutions seem more or less equivalent.
Imo the most natural thing for a page walking function would be to return the
leaf PTE (or the leaf-most not-present [or otherwise "no-access"] one). That
would provide (almost) all possible information to the caller. "Almost"
because depending on how page walk works, permissions may combine across page
table levels. Yet then (see also the "no-access" above) this would also
require further input, to specify the context for which the translation is
being seeked. For example, the intention to write may want to yield no valid
PTE when there are present ones down to the leaf, but effective permissions
say "read-only".
Perhaps returning the leaf PTE could be a really good option.
I'm not entirely sure I understand what you mean by "leaf-most not-present". Could you please try to explain this moment one more time?
My expectation was that the function should return an existing leaf PTE (from which "access" rights could be determined)
or NULL to indicate that no leaf PTE was found.
Another thing I'm curious about is whether this would be sufficient for determining the level.
It seems clear that, given a PTE and a virtual address, we could compute:
  mask = VA | paddr_from_pte(pte)
Then, iterating through each level, we could apply and understand on which one level it was mapped:
  mask & (BIT(XEN_PT_LEVEL_ORDER(i), UL) - 1).
If I haven't overlooked any other way to calculate the page table level, would it be better to simply add another argument
to pt_walk() to return the level.
Thanks.


~ Oleksii

Jan

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.