Re: [PATCH v2 13/17] xen/riscv: Implement p2m_entry_from_mfn() and support PBMT configuration

To: Jan Beulich <jbeulich@xxxxxxxx>
From: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>
Date: Fri, 18 Jul 2025 11:52:31 +0200
Cc: Alistair Francis <alistair.francis@xxxxxxx>, Bob Eshleman <bobbyeshleman@xxxxxxxxx>, Connor Davis <connojdavis@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 18 Jul 2025 09:52:43 +0000
List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 7/17/25 12:25 PM, Jan Beulich wrote:

On 17.07.2025 10:56, Oleksii Kurochko wrote:

On 7/16/25 6:18 PM, Jan Beulich wrote:

On 16.07.2025 18:07, Oleksii Kurochko wrote:

On 7/16/25 1:31 PM, Jan Beulich wrote:

On 15.07.2025 16:47, Oleksii Kurochko wrote:

On 7/1/25 5:08 PM, Jan Beulich wrote:

On 10.06.2025 15:05, Oleksii Kurochko wrote:

--- a/xen/arch/riscv/p2m.c
+++ b/xen/arch/riscv/p2m.c
@@ -345,6 +345,26 @@ static pte_t *p2m_get_root_pointer(struct p2m_domain *p2m, gfn_t gfn)
        return __map_domain_page(p2m->root + root_table_indx);
    }
    
+static int p2m_type_radix_set(struct p2m_domain *p2m, pte_t pte, p2m_type_t t)

See comments on the earlier patch regarding naming.

+{
+    int rc;
+    gfn_t gfn = mfn_to_gfn(p2m->domain, mfn_from_pte(pte));

How does this work, when you record GFNs only for Xenheap pages?

I don't think I understand what the issue is. Could you please provide
some extra details?

Counter question: The mfn_to_gfn() you currently have is only a stub. It only
works for 1:1 mapped domains. Can you show me the eventual final implementation
of the function, making it possible to use it here?

At the moment, I plan to support only 1:1 mapped domains, so this is the final
implementation.

Isn't that an overly severe limitation?

I wouldn't say that it's a severe limitation, as it's just a matter of how
|mfn_to_gfn()| is implemented. When non-1:1 mapped domains are supported,
|mfn_to_gfn()| can be implemented differently, while the code where it’s called
will likely remain unchanged.

What I meant in my reply is that, for the current state and current limitations,
this is the final implementation of |mfn_to_gfn()|. But that doesn't mean I don't
see the value in, or the need for, non-1:1 mapped domains; it's just that this
limitation simplifies development at the current stage of the RISC-V port.
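
For readers following the thread, the 1:1 stub under discussion can be
pictured roughly as below. This is only a minimal sketch, not the code from
the patch: the simplified mfn_t/gfn_t wrappers, the struct domain with a
directmap flag, and the name mfn_to_gfn_sketch() are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-ins for typesafe frame-number wrappers (assumed). */
typedef struct { uint64_t v; } mfn_t;
typedef struct { uint64_t v; } gfn_t;

/* Hypothetical, stripped-down domain descriptor. */
struct domain {
    bool directmap;   /* true when guest frames equal machine frames */
};

/*
 * Stub translation: valid only for 1:1 (direct-mapped) domains, where
 * the GFN is numerically identical to the MFN. For any other domain a
 * real reverse map (MFN -> GFN) would be needed, which this sketch does
 * not provide.
 */
static gfn_t mfn_to_gfn_sketch(const struct domain *d, mfn_t mfn)
{
    assert(d->directmap);   /* no reverse map exists for other domains */
    return (gfn_t){ mfn.v };
}
```

The assertion makes the limitation explicit: a caller that passes a
non-1:1 domain gets a failure instead of a silently wrong GFN.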

Simplification is fine in some cases, but not supporting the "normal" way of
domain construction looks like a pretty odd restriction. I'm also curious
how you envision to implement mfn_to_gfn() then, suitable for generic use like
the one here. Imo, current limitation or not, you simply want to avoid use of
that function outside of the special gnttab case.

In this context (not sure if I asked before): With this use of a radix tree,
how do you intend to bound the amount of memory that a domain can use, by
making Xen insert very many entries?

I didn’t think about that. I assumed it would be enough to set the amount of
memory a guest domain can use by specifying |xen,domain-p2m-mem-mb| in the DTS,
or using some predefined value if |xen,domain-p2m-mem-mb| isn’t explicitly set.

Which would require these allocations to come from that pool.

Yes, and it is true only for non-hardware domains with the current implementation.

???

I meant that the pool is currently used only for non-hardware domains.

And how does this matter here? The memory required for the radix tree doesn't
come from that pool anyway.

I thought that it would be possible to do that somehow, but looking at the code
of radix-tree.c, it seems the only way to allocate memory for the radix
tree is radix_tree_node_alloc() -> xzalloc(struct rcu_node).

Then either something like radix_tree_node_allocate(domain) needs to be
introduced, or the radix tree can't be used at all, for the security reason
mentioned in the previous replies, no?

Also, it seems this would just lead to the issue you mentioned earlier: when
the memory runs out, |domain_crash()| will be called or the PTE will be zapped.

Or one domain exhausting memory would cause another domain to fail. A domain
impacting just itself may be tolerable. But a domain affecting other domains
isn't.

But it seems like this issue could happen in any implementation. It can be
avoided only if every domain type (hardware, control, guest domain) has a
pre-populated pool, with no ability to extend it or to allocate extra pages
from the domheap at runtime. Otherwise, if allocating extra pages is allowed,
we can't really do anything about this issue.

But that's why I brought this up: You simply have to. Or, as indicated, the
moment you mark Xen security-supported on RISC-V, there will be an XSA needed.

Why isn't it an XSA for other architectures? At least Arm should then have such
an XSA.

Does Arm use a radix tree for storing types? It uses one for mem-access, but
it's not clear to me whether that's actually a supported feature.

I don't understand why x86 won't have the same issue. Memory is a limited
and shared resource, so if one of the domains uses too much memory, then it could
happen that other domains won't have enough memory for their purposes...

The question is whether allocations are bounded. With this use of a radix tree,
you give domains a way to have Xen allocate pretty much arbitrary amounts of
memory to populate that tree. That unbounded-ness is the problem, not memory
allocations in general.
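
To illustrate the difference between bounded and unbounded allocation, here
is a minimal sketch of a per-domain budget for tree nodes. It is not Xen
code: struct node_budget, budget_alloc() and budget_free() are hypothetical
names, and calloc()/free() merely stand in for whatever allocator would back
such a scheme. The point is only that a domain hitting its own limit gets a
failure, without draining memory other domains depend on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-domain accounting for tree-node allocations. */
struct node_budget {
    size_t limit;   /* maximum number of nodes this domain may hold */
    size_t used;    /* nodes currently allocated on its behalf */
};

/* Allocate one zeroed node, or fail once the domain's budget is spent. */
static void *budget_alloc(struct node_budget *b, size_t size)
{
    void *p;

    if ( b->used >= b->limit )
        return NULL;            /* bounded: only this domain is affected */

    p = calloc(1, size);
    if ( p )
        b->used++;

    return p;
}

/* Return a node to the allocator and give the slot back to the budget. */
static void budget_free(struct node_budget *b, void *p)
{
    if ( !p )
        return;

    assert(b->used > 0);
    free(p);
    b->used--;
}
```

With a limit fixed at domain creation, the worst case a guest can force is
known in advance, which is the property being asked for above.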

Isn't the radix tree key bounded by the number of GFNs given to a domain? We can't
have more keys than the max GFN number for a domain. So the potential amount of
memory needed for the radix tree is also bounded by the number of GFNs.

Anyway, IIUC I just can't use a radix tree for p2m types at all, right?
If so, does it make sense to borrow 2 bits from struct page_info->type_info, as it
currently uses 9 bits for the count of a frame?
Then we would have a 7-bit reference counter, and 2 bits for p2m types in type_info
plus 2 bits in the PTE, which overall gives us 4 bits, i.e. 16 p2m types.
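
The arithmetic of that split can be sketched as follows. This is only an
illustration under stated assumptions: the bit positions are hypothetical
(the PTE bits are placed at 8-9, where the RISC-V PTE format has its two
software-reserved RSW bits, and the type_info bits are placed directly above
an assumed 7-bit count), and p2m_type_store()/p2m_type_load() are made-up
helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed positions; the real layout may differ. */
#define PTE_TYPE_SHIFT  8                               /* 2 bits in the PTE */
#define PTE_TYPE_MASK   (UINT64_C(3) << PTE_TYPE_SHIFT)
#define PGT_TYPE_SHIFT  7                               /* above a 7-bit count */
#define PGT_TYPE_MASK   (UINT64_C(3) << PGT_TYPE_SHIFT)

/* Split a 4-bit type: low 2 bits into the PTE, high 2 bits into type_info. */
static void p2m_type_store(uint64_t *pte, uint64_t *type_info, unsigned int t)
{
    assert(t < 16);   /* 2 + 2 bits can encode 16 distinct types */

    *pte = (*pte & ~PTE_TYPE_MASK) |
           ((uint64_t)(t & 3) << PTE_TYPE_SHIFT);
    *type_info = (*type_info & ~PGT_TYPE_MASK) |
                 ((uint64_t)(t >> 2) << PGT_TYPE_SHIFT);
}

/* Reassemble the 4-bit type from both locations. */
static unsigned int p2m_type_load(uint64_t pte, uint64_t type_info)
{
    unsigned int lo = (unsigned int)((pte & PTE_TYPE_MASK) >> PTE_TYPE_SHIFT);
    unsigned int hi = (unsigned int)((type_info & PGT_TYPE_MASK) >> PGT_TYPE_SHIFT);

    return (hi << 2) | lo;
}
```

Both helpers leave all other bits of the PTE and of type_info untouched, so
the remaining 7 count bits keep working as a reference counter.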

~ Oleksii

