Re: [PATCH v2 12/17] xen/riscv: Implement p2m_free_entry() and related helpers
On 11.07.2025 17:56, Oleksii Kurochko wrote:
> On 7/1/25 4:23 PM, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> +/*
>>> + * In the case of the P2M, the valid bit is used for other purposes. Use
>>> + * the type to check whether an entry is valid.
>>> + */
>>>  static inline bool p2me_is_valid(struct p2m_domain *p2m, pte_t pte)
>>>  {
>>> -    panic("%s: isn't implemented for now\n", __func__);
>>> +    return p2m_type_radix_get(p2m, pte) != p2m_invalid;
>>> +}
>>
>> No checking of the valid bit?
>
> As mentioned in the comment, only the P2M type should be checked, since the
> valid bit is used for other purposes we discussed earlier, for example, to
> track whether pages were accessed by a guest domain, or to support certain
> table invalidation optimizations (1) and (2).
> So, in this case, we only need to consider whether the entry is invalid
> from the P2M perspective.
>
> (1) https://github.com/xen-project/xen/blob/19772b67/xen/arch/arm/mmu/p2m.c#L1245
> (2) https://github.com/xen-project/xen/blob/19772b67/xen/arch/arm/mmu/p2m.c#L1386

And there can be e.g. entries with the valid bit set and the type being
p2m_invalid? IOW, is there no short-circuiting possible in any of the cases,
avoiding the radix tree lookup in at least some of them?

>>> @@ -404,11 +426,127 @@ static int p2m_next_level(struct p2m_domain *p2m, bool alloc_tbl,
>>>      return GUEST_TABLE_MAP_NONE;
>>>  }
>>>
>>> +static void p2m_put_foreign_page(struct page_info *pg)
>>> +{
>>> +    /*
>>> +     * It's safe to do the put_page here because page_alloc will
>>> +     * flush the TLBs if the page is reallocated before the end of
>>> +     * this loop.
>>> +     */
>>> +    put_page(pg);
>>
>> Is the comment really true? The page allocator will flush the normal
>> TLBs, but not the stage-2 ones. Yet those are what you care about here,
>> aiui.
>
> In alloc_heap_pages():
>
>     ...
>     if ( need_tlbflush )
>         filtered_flush_tlb_mask(tlbflush_timestamp);
>     ...
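For RISC-V, the counterpart of Arm's flush_all_guests_tlb() would presumably have to issue HFENCE.GVMA to invalidate stage-2 (guest-physical) translations. A rough sketch of that shape follows; the function names and the test flag are invented for illustration, and the fence itself appears only as a comment so the snippet stays buildable and runnable on any host:

```c
#include <assert.h>
#include <stdbool.h>

static bool guest_tlb_flushed;   /* test hook standing in for real hardware */

/*
 * Hypothetical RISC-V primitive: on real hardware this would execute
 *     hfence.gvma zero, zero
 * to flush all guest-physical (stage-2) translations for all VMIDs on
 * the local hart.  Stubbed here so the sketch builds everywhere.
 */
static void flush_all_guests_tlb_local(void)
{
    /* asm volatile ( "hfence.gvma zero, zero" ::: "memory" ); */
    guest_tlb_flushed = true;
}

/*
 * Sketch of an arch_flush_tlb_mask() counterpart (the cpumask parameter
 * is dropped for brevity): if the fence reaches all harts no IPI is
 * needed, mirroring Arm; otherwise an SBI remote-fence call would be
 * required (omitted here).
 */
static void arch_flush_tlb_mask_sketch(void)
{
    flush_all_guests_tlb_local();
}
```

This only demonstrates the call shape being discussed, not the actual RISC-V port.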
>
> filtered_flush_tlb_mask() calls arch_flush_tlb_mask().
>
> And arch_flush_tlb_mask(), at least on Arm (I haven't checked x86), is
> implemented as:
>
> void arch_flush_tlb_mask(const cpumask_t *mask)
> {
>     /* No need to IPI other processors on ARM, the processor takes care of it. */
>     flush_all_guests_tlb();
> }
>
> So it flushes the stage-2 TLB.

Hmm, okay. And I take it you have the same plan on RISC-V? What I'd like to
ask for, though, is that the comment (also) mentions where that (guest)
flushing actually happens. That's not in page_alloc.c, and it also wasn't
originally intended for guest TLBs to be flushed from there (as x86 is where
the flush avoidance machinery originates, which Arm and now also RISC-V
don't really use).

>>> +/* Put any references on the single 4K page referenced by mfn. */
>>> +static void p2m_put_4k_page(mfn_t mfn, p2m_type_t type)
>>> +{
>>> +    /* TODO: Handle other p2m types */
>>> +
>>> +    /* Detect the xenheap page and mark the stored GFN as invalid. */
>>> +    if ( p2m_is_ram(type) && is_xen_heap_mfn(mfn) )
>>> +        page_set_xenheap_gfn(mfn_to_page(mfn), INVALID_GFN);
>>
>> Is this a valid thing to do? How do you make sure the respective uses
>> (in gnttab's shared and status page arrays) are / were also removed?
>
> As the grant table frame GFN is stored directly in struct page_info instead
> of being kept in standalone status/shared arrays, there is no need for
> those arrays.

I fear I don't follow. Looking at Arm's header (which I understand you
derive from), I see

#define gnttab_shared_page(t, i) virt_to_page((t)->shared_raw[i])
#define gnttab_status_page(t, i) virt_to_page((t)->status[i])

Are you intending to do things differently?

>>> +/* Put any references on the superpage referenced by mfn. */
>>> +static void p2m_put_2m_superpage(mfn_t mfn, p2m_type_t type)
>>> +{
>>> +    struct page_info *pg;
>>> +    unsigned int i;
>>> +
>>> +    ASSERT(mfn_valid(mfn));
>>> +
>>> +    pg = mfn_to_page(mfn);
>>> +
>>> +    for ( i = 0; i < XEN_PT_ENTRIES; i++, pg++ )
>>> +        p2m_put_foreign_page(pg);
>>> +}
>>> +
>>> +/* Put any references on the page referenced by pte. */
>>> +static void p2m_put_page(struct p2m_domain *p2m, const pte_t pte,
>>> +                         unsigned int level)
>>> +{
>>> +    mfn_t mfn = pte_get_mfn(pte);
>>> +    p2m_type_t p2m_type = p2m_type_radix_get(p2m, pte);
>>
>> This gives you the type of the 1st page. What guarantees that all other
>> pages in a superpage are of the exact same type?
>
> Doesn't a superpage mean that all the 4KB pages within it have the same
> type and are contiguous in memory?

If the mapping is a super-page one - yes. Yet I see nothing super-page-ish
here.

>>> +    if ( level == 1 )
>>> +        return p2m_put_2m_superpage(mfn, p2m_type);
>>> +    else if ( level == 0 )
>>> +        return p2m_put_4k_page(mfn, p2m_type);
>>
>> Use switch() right away?
>
> It could be; I think there's no big difference at the moment, at least.
> But I am okay with reworking it.

If you don't want to use switch() here, then my other style nit needs
giving: please avoid "else" in situations like this.

>>> +static void p2m_free_page(struct domain *d, struct page_info *pg)
>>> +{
>>> +    if ( is_hardware_domain(d) )
>>> +        free_domheap_page(pg);
>>
>> Why's the hardware domain different here? It should have a pool just like
>> all other domains have.
>
> The hardware domain (dom0) should have no limit on the number of pages it
> can allocate, so p2m pages for the hardware domain are allocated from the
> heap.
>
> The idea of the p2m pool is to provide a way to put a clear limit on the
> amount of p2m allocation.

Well, we had been there on another thread, and I outlined how I think Dom0
may want handling.
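To illustrate the switch() shape suggested above for the level dispatch in p2m_put_page(): the sketch below uses stand-in types and stubbed helpers (recording which path ran) rather than Xen's real ones, purely to show the structure without the if/else-if chain:

```c
#include <assert.h>
#include <string.h>

typedef struct { unsigned long m; } mfn_t;            /* stand-in type */
typedef enum { p2m_invalid, p2m_ram } p2m_type_t;     /* stand-in type */

static const char *last_path;                         /* records taken path */

/* Stubs standing in for the real put helpers. */
static void p2m_put_2m_superpage(mfn_t mfn, p2m_type_t type)
{
    (void)mfn; (void)type;
    last_path = "2m";
}

static void p2m_put_4k_page(mfn_t mfn, p2m_type_t type)
{
    (void)mfn; (void)type;
    last_path = "4k";
}

/* Dispatch on the page-table level with switch(), avoiding "else". */
static void p2m_put_page(mfn_t mfn, p2m_type_t type, unsigned int level)
{
    switch ( level )
    {
    case 1:
        p2m_put_2m_superpage(mfn, type);
        break;

    case 0:
        p2m_put_4k_page(mfn, type);
        break;

    default:
        /* Levels > 1 (e.g. 1G superpages) intentionally not handled here. */
        break;
    }
}
```

A switch() also gives a natural place (the default label) to make the unhandled-level case explicit.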
>>>  /* Free pte sub-tree behind an entry */
>>>  static void p2m_free_entry(struct p2m_domain *p2m,
>>>                             pte_t entry, unsigned int level)
>>>  {
>>> -    panic("%s: hasn't been implemented yet\n", __func__);
>>> +    unsigned int i;
>>> +    pte_t *table;
>>> +    mfn_t mfn;
>>> +    struct page_info *pg;
>>> +
>>> +    /* Nothing to do if the entry is invalid. */
>>> +    if ( !p2me_is_valid(p2m, entry) )
>>> +        return;
>>
>> Does this actually apply to intermediate page tables (which you handle
>> later in the function), when that's (only) a P2M type check?
>
> Yes, any PTE should have the V bit set to 1, so from the P2M perspective
> it should also, at least, not be equal to p2m_invalid.

I don't follow. Where would that type be set? The radix tree being GFN-
indexed, you would need to "invent" a GFN for every intermediate page
table, just to be able to (legitimately) invoke the type retrieval
function.

Maybe you mean to leverage that (now, i.e. post-v2) you encode some of the
types directly in the PTE, and p2m_invalid may be one of them. But that
wasn't the case in the v2 submission, and hence the code looked wrong to
me. Which in turn suggests that at least some better commentary is going
to be needed, maybe even some BUILD_BUG_ON().

Jan
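As an illustration of the BUILD_BUG_ON() direction mentioned above: the sketch below enforces, at build time, the assumptions such a type-only validity check would rely on. The in-PTE type field, its layout, and all names here are invented for the example and are not Xen's actual RISC-V definitions:

```c
#include <assert.h>

/* Minimal stand-in for Xen's BUILD_BUG_ON(): breaks the build if cond holds. */
#define BUILD_BUG_ON(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* Invented encoding: two software bits of the PTE hold the P2M type. */
#define P2M_PTE_TYPE_SHIFT 54
#define P2M_PTE_TYPE_MASK  (0x3UL << P2M_PTE_TYPE_SHIFT)

typedef enum { p2m_invalid, p2m_ram, p2m_map_foreign, p2m_mmio_direct } p2m_type_t;

/*
 * Enforce at build time what a type-only p2me_is_valid() would rely on
 * for intermediate page tables: p2m_invalid is the all-zero encoding
 * (so a cleared PTE reads back as invalid without any radix lookup),
 * and every directly-encoded type fits in the in-PTE field.
 */
static void p2m_build_assertions(void)
{
    BUILD_BUG_ON(p2m_invalid != 0);
    BUILD_BUG_ON(p2m_mmio_direct > (P2M_PTE_TYPE_MASK >> P2M_PTE_TYPE_SHIFT));
}
```

Either assertion failing would make the array size negative and abort compilation, which documents the invariant where the code depends on it.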