[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v4 17/18] xen/riscv: add support of page lookup by GFN
On 9/22/25 10:46 PM, Jan Beulich wrote:
On 17.09.2025 23:55, Oleksii Kurochko wrote:--- a/xen/arch/riscv/p2m.c +++ b/xen/arch/riscv/p2m.c @@ -978,3 +978,189 @@ int map_regions_p2mt(struct domain *d, return rc; } + +Nit: No double blank lines please.+/* + * p2m_get_entry() should always return the correct order value, even if an + * entry is not present (i.e. the GFN is outside the range): + * [p2m->lowest_mapped_gfn, p2m->max_mapped_gfn]). (1) + * + * This ensures that callers of p2m_get_entry() can determine what range of + * address space would be altered by a corresponding p2m_set_entry(). + * Also, it would help to avoid cost page walks for GFNs outside range (1). + * + * Therefore, this function returns true for GFNs outside range (1), and in + * that case the corresponding level is returned via the level_out argument. + * Otherwise, it returns false and p2m_get_entry() performs a page walk to + * find the proper entry. + */ +static bool check_outside_boundary(gfn_t gfn, gfn_t boundary, bool is_lower, + unsigned int *level_out) +{ + unsigned int level; + + if ( (is_lower && gfn_x(gfn) < gfn_x(boundary)) || + (!is_lower && gfn_x(gfn) > gfn_x(boundary)) )I understand people write things this way, but personally I find it confusing to read. Why not simply use a conditional operator here (and again below): if ( is_lower ? gfn_x(gfn) < gfn_x(boundary) : gfn_x(gfn) > gfn_x(boundary) ) I am okay with both options. If you think the second one is more readable then I will use it. + { + for ( level = P2M_ROOT_LEVEL; level; level-- ) + { + unsigned long mask = PFN_DOWN(P2M_LEVEL_MASK(level));Don't you need to accumulate the mask to use across loop iterations here (or calculate it accordingly)? Else ...+ if ( (is_lower && ((gfn_x(gfn) & mask) < gfn_x(boundary))) || + (!is_lower && ((gfn_x(gfn) & mask) > gfn_x(boundary))) )... here you'll compare some middle part of the original GFN against the boundary. Agree, accumulation of the mask should be done here. + { + *level_out = level; + return true; + } + } + } + + return false; +} + +/* + * Get the details of a given gfn. + * + * If the entry is present, the associated MFN will be returned and the + * p2m type of the mapping. + * The page_order will correspond to the order of the mapping in the page + * table (i.e it could be a superpage). + * + * If the entry is not present, INVALID_MFN will be returned and the + * page_order will be set according to the order of the invalid range. + * + * valid will contain the value of bit[0] (e.g valid bit) of the + * entry. + */ +static mfn_t p2m_get_entry(struct p2m_domain *p2m, gfn_t gfn, + p2m_type_t *t, + unsigned int *page_order, + bool *valid) +{ + unsigned int level = 0; + pte_t entry, *table; + int rc; + mfn_t mfn = INVALID_MFN; + DECLARE_OFFSETS(offsets, gfn_to_gaddr(gfn)); + + ASSERT(p2m_is_locked(p2m)); + + if ( valid ) + *valid = false;Wouldn't you better similarly set *t to some "default" value? I think it makes sense. I will set it to p2m_invalid. + if ( check_outside_boundary(gfn, p2m->lowest_mapped_gfn, true, &level) ) + goto out; + + if ( check_outside_boundary(gfn, p2m->max_mapped_gfn, false, &level) ) + goto out; + + table = p2m_get_root_pointer(p2m, gfn); + + /* + * The table should always be non-NULL because the gfn is below + * p2m->max_mapped_gfn and the root table pages are always present. + */ + if ( !table ) + { + ASSERT_UNREACHABLE(); + level = P2M_ROOT_LEVEL; + goto out; + } + + for ( level = P2M_ROOT_LEVEL; level; level-- ) + { + rc = p2m_next_level(p2m, false, level, &table, offsets[level]); + if ( (rc == P2M_TABLE_MAP_NONE) || (rc == P2M_TABLE_MAP_NOMEM) ) + goto out_unmap;Getting back P2M_TABLE_MAP_NOMEM here is a bug, not really a loop exit condition. Oh, I agree. With the second argument set to + if ( rc != P2M_TABLE_NORMAL ) + break; + } + + entry = table[offsets[level]]; + + if ( pte_is_valid(entry) ) + { + if ( t ) + *t = p2m_get_type(entry); + + mfn = pte_get_mfn(entry); + /* + * The entry may point to a superpage. Find the MFN associated + * to the GFN. + */ + mfn = mfn_add(mfn, + gfn_x(gfn) & (BIT(P2M_LEVEL_ORDER(level), UL) - 1));May want to assert that the respective bits of "mfn" are actually clear before this calculation. ASSERT(!(mfn & (BIT(P2M_LEVEL_ORDER(level), UL) - 1))); Do you mean something like that? I am not 100% sure that there is really need for that as page-fault exception is raised if the PA is insufficienlty aligned: Any level of PTE may be a leaf PTE, so in addition to 4 KiB pages, Sv39 supports 2 MiB megapages and 1 GiB gigapages, each of which must be virtually and physically aligned to a boundary equal to its size. A page-fault exception is raised if the physical address is insufficiently aligned. + if ( valid ) + *valid = pte_is_valid(entry); + } + + out_unmap: + unmap_domain_page(table); + + out: + if ( page_order ) + *page_order = P2M_LEVEL_ORDER(level); + + return mfn; +} + +static mfn_t p2m_lookup(struct p2m_domain *p2m, gfn_t gfn, p2m_type_t *t) +{ + mfn_t mfn; + + p2m_read_lock(p2m); + mfn = p2m_get_entry(p2m, gfn, t, NULL, NULL);Seeing the two NULLs here I wonder: What use is the "valid" parameter of that function? `valid` parameter isn't really needed anymore. It was needed when I had a copy of `valid` bit with real (in PTE) valid bit set to 0 to track which one pages are used. I will drop `valid` parameter. And what use is the function here when it doesn't also return the order? It could be used for gfn_to_mfn(), but p2m_get_entry() could be used too, just not need to forget to wrap by p2m_read_(un)lock() each time when p2m_get_entry() is used. Probably, it makes sense to put p2m_read_(un)lock() inside p2m_get_entry(). I think we can leave only p2m_get_entry() and drop p2m_lookup(). IOW I'm not sure having this helper is actually worthwhile. This is even more so that ... + p2m_read_unlock(p2m); + + return mfn; +} + +struct page_info *p2m_get_page_from_gfn(struct p2m_domain *p2m, gfn_t gfn, + p2m_type_t *t) +{ + struct page_info *page; + p2m_type_t p2mt = p2m_invalid; + mfn_t mfn; + + p2m_read_lock(p2m); + mfn = p2m_lookup(p2m, gfn, t);... there's a locking problem here: You cannot acquire a read lock in a nested fashion - that's a recipe for a deadlock when between the first acquire and the 2nd acquire attempt another CPU tries to acquire the lock for writing (which will result in no further readers being allowed in). It wasn't all that long ago that in the security team we actually audited the code base for the absence of such a pattern. Oh, missed such case. Thanks for explanation and review. ~ Oleksii
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |