[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 1/3] xen/riscv: implement software page table walking


  • To: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 27 Jan 2025 13:57:30 +0100
  • Autocrypt: addr=jbeulich@xxxxxxxx; keydata= xsDiBFk3nEQRBADAEaSw6zC/EJkiwGPXbWtPxl2xCdSoeepS07jW8UgcHNurfHvUzogEq5xk hu507c3BarVjyWCJOylMNR98Yd8VqD9UfmX0Hb8/BrA+Hl6/DB/eqGptrf4BSRwcZQM32aZK 7Pj2XbGWIUrZrd70x1eAP9QE3P79Y2oLrsCgbZJfEwCgvz9JjGmQqQkRiTVzlZVCJYcyGGsD /0tbFCzD2h20ahe8rC1gbb3K3qk+LpBtvjBu1RY9drYk0NymiGbJWZgab6t1jM7sk2vuf0Py O9Hf9XBmK0uE9IgMaiCpc32XV9oASz6UJebwkX+zF2jG5I1BfnO9g7KlotcA/v5ClMjgo6Gl MDY4HxoSRu3i1cqqSDtVlt+AOVBJBACrZcnHAUSuCXBPy0jOlBhxPqRWv6ND4c9PH1xjQ3NP nxJuMBS8rnNg22uyfAgmBKNLpLgAGVRMZGaGoJObGf72s6TeIqKJo/LtggAS9qAUiuKVnygo 3wjfkS9A3DRO+SpU7JqWdsveeIQyeyEJ/8PTowmSQLakF+3fote9ybzd880fSmFuIEJldWxp Y2ggPGpiZXVsaWNoQHN1c2UuY29tPsJgBBMRAgAgBQJZN5xEAhsDBgsJCAcDAgQVAggDBBYC AwECHgECF4AACgkQoDSui/t3IH4J+wCfQ5jHdEjCRHj23O/5ttg9r9OIruwAn3103WUITZee e7Sbg12UgcQ5lv7SzsFNBFk3nEQQCACCuTjCjFOUdi5Nm244F+78kLghRcin/awv+IrTcIWF hUpSs1Y91iQQ7KItirz5uwCPlwejSJDQJLIS+QtJHaXDXeV6NI0Uef1hP20+y8qydDiVkv6l IreXjTb7DvksRgJNvCkWtYnlS3mYvQ9NzS9PhyALWbXnH6sIJd2O9lKS1Mrfq+y0IXCP10eS FFGg+Av3IQeFatkJAyju0PPthyTqxSI4lZYuJVPknzgaeuJv/2NccrPvmeDg6Coe7ZIeQ8Yj t0ARxu2xytAkkLCel1Lz1WLmwLstV30g80nkgZf/wr+/BXJW/oIvRlonUkxv+IbBM3dX2OV8 AmRv1ySWPTP7AAMFB/9PQK/VtlNUJvg8GXj9ootzrteGfVZVVT4XBJkfwBcpC/XcPzldjv+3 HYudvpdNK3lLujXeA5fLOH+Z/G9WBc5pFVSMocI71I8bT8lIAzreg0WvkWg5V2WZsUMlnDL9 mpwIGFhlbM3gfDMs7MPMu8YQRFVdUvtSpaAs8OFfGQ0ia3LGZcjA6Ik2+xcqscEJzNH+qh8V m5jjp28yZgaqTaRbg3M/+MTbMpicpZuqF4rnB0AQD12/3BNWDR6bmh+EkYSMcEIpQmBM51qM EKYTQGybRCjpnKHGOxG0rfFY1085mBDZCH5Kx0cl0HVJuQKC+dV2ZY5AqjcKwAxpE75MLFkr wkkEGBECAAkFAlk3nEQCGwwACgkQoDSui/t3IH7nnwCfcJWUDUFKdCsBH/E5d+0ZnMQi+G0A nAuWpQkjM1ASeQwSHEeAWPgskBQL
  • Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 27 Jan 2025 12:57:33 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.01.2025 13:29, Oleksii Kurochko wrote:
> On 1/27/25 11:06 AM, Jan Beulich wrote:
>> On 20.01.2025 17:54, Oleksii Kurochko wrote:
>>> RISC-V doesn't have hardware feature to ask MMU to translate
>>> virtual address to physical address ( like Arm has, for example ),
>>> so software page table walking in implemented.
>>>
>>> Signed-off-by: Oleksii Kurochko<oleksii.kurochko@xxxxxxxxx>
>>> ---
>>>   xen/arch/riscv/include/asm/mm.h |  2 ++
>>>   xen/arch/riscv/pt.c             | 56 +++++++++++++++++++++++++++++++++
>>>   2 files changed, 58 insertions(+)
>>>
>>> diff --git a/xen/arch/riscv/include/asm/mm.h 
>>> b/xen/arch/riscv/include/asm/mm.h
>>> index 292aa48fc1..d46018c132 100644
>>> --- a/xen/arch/riscv/include/asm/mm.h
>>> +++ b/xen/arch/riscv/include/asm/mm.h
>>> @@ -15,6 +15,8 @@
>>>   
>>>   extern vaddr_t directmap_virt_start;
>>>   
>>> +paddr_t pt_walk(vaddr_t va);
>> In the longer run, is returning just the PA really going to be sufficient?
>> If not, perhaps say a word on the limitation in the description.
> 
> In the long run, this function's prototype looks like|paddr_t pt_walk(vaddr_t 
> root, vaddr_t va, bool is_xen)| [1]. However, I'm not sure if it will stay 
> that way,
> as I think|is_xen| could be skipped, since using|map_table()| should be 
> sufficient (as it now considers|system_state|) and I'm not really sure if I 
> need root argument
> as initial goal was to use this function for debug only purposes and I've 
> never used it for guest page table (stage-1) walking.
> Anyway, yes, it is still returning a physical address, and that seems enough 
> to me.
> 
> Could you share your thoughts on what I should take into account for 
> returning value, probably, I am missing something really useful?

Often you care about the permissions as well. Sometimes it may even be relevant
to know the (super-)page size of the mapping.

> |[1] 
> https://gitlab.com/xen-project/people/olkur/xen/-/blob/latest/xen/arch/riscv/mm.c#L820|
> 
>>> +     * initialize `ret` with "impossible" XEN_TABLE_MAP_NOMEM.
>>> +    */
>>> +    int ret = XEN_TABLE_MAP_NOMEM;
>>> +    unsigned int level = HYP_PT_ROOT_LEVEL;
>>> +    paddr_t pa = 0;
>> Seeing 0 as initializer here, just to double check: You do prevent MFN 0
>> to be handed to the page allocator, and you also don't use it "manually"
>> anywhere?
> 
> MFN 0 could be used when removing the 
> page:https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/riscv/pt.c?ref_type=heads#L251
>  
> <https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/riscv/pt.c?ref_type=heads#L251>.
> 
> In that case, it would be better to initialize|pa| with|(paddr_t)(-1)|, as 
> this value couldn't be mapped and is safe to use as an invalid value.
> 
>>
>>> +    pte_t *table;
>>> +
>>> +    DECLARE_OFFSETS(offsets, va);
>>> +
>>> +    table = map_table(root);
>>> +
>>> +    /*
>>> +     * Find `pa` of an entry which corresponds to `va` by iterating for 
>>> each
>>> +     * page level and checking if the entry points to a next page table or
>>> +     * to a page.
>>> +     *
>>> +     * Two cases are possible:
>>> +     * - ret == XEN_TABLE_SUPER_PAGE means that the entry was find;
>>> +     *   (Despite of the name) XEN_TABLE_SUPER_PAGE covers 4k mapping too.
>>> +     * - ret == XEN_TABLE_MAP_NONE means that requested `va` wasn't 
>>> actually
>>> +     *   mapped.
>>> +     */
>>> +    while ( (ret != XEN_TABLE_MAP_NONE) && (ret != XEN_TABLE_SUPER_PAGE) )
>>> +    {
>>> +        /*
>>> +         * This case shouldn't really occur as it will mean that for table
>>> +         * level 0 a pointer to next page table has been written, but at
>>> +         * level 0 it could be only a pointer to 4k page.
>>> +         */
>>> +        ASSERT(level <= HYP_PT_ROOT_LEVEL);
>>> +
>>> +        ret = pt_next_level(false, &table, offsets[level]);
>>> +        level--;
>>> +    }
>>> +
>>> +    if ( ret == XEN_TABLE_MAP_NONE )
>>> +        dprintk(XENLOG_WARNING, "Is va(%#lx) really mapped?\n", va);
>> Even if it's a dprintk(), I'd recommend against adding such.
>>
>>> +    else if ( ret == XEN_TABLE_SUPER_PAGE )
>>> +        pa = pte_to_paddr(*(table + offsets[level + 1]));
>> Missing "else if ( ret == XEN_TABLE_NORMAL )" (or maybe simply "else")?
> 
> If I am not missing something, we can't be here with ret == XEN_TABLE_NORMAL 
> because we iterating
> in the while loop above until we don't find a leaf or until reach level = 0.

I'll admit that I didn't specifically check whether XEN_TABLE_NORMAL could
be observed here; my point was that non-super-page mappings aren't handled,
as you ...

> If we find a leaf then
> XEN_TABLE_SUPER_PAGE is returned; otherwise sooner or later we should face a 
> case when next table
> (in case when `level`=0 and someone put at this level a pointer to next 
> level, what is a bug) should
> be allocated in pt_next_level(), but it will fail because `alloc_tbl`=false 
> is passed to pt_next_level()
> and thereby ret=XEN_TABLE_MAP_NONE() will be returned.
> 
> Based on your previous comment about dprintk this could could be re-written 
> in the following way:
> -    if ( ret == XEN_TABLE_MAP_NONE )
> -        dprintk(XENLOG_WARNING, "Is va(%#lx) really mapped?\n", va);
> -    else if ( ret == XEN_TABLE_SUPER_PAGE )
> +    if ( ret != XEN_TABLE_MAP_NONE )
>           pa = pte_to_paddr(*(table + offsets[level + 1]));

... appear to confirm here.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.