Re: [PATCH v2 06/17] xen/riscv: add root page table allocation
On 30.06.2025 18:18, Oleksii Kurochko wrote:
> On 6/30/25 5:22 PM, Jan Beulich wrote:
>> On 10.06.2025 15:05, Oleksii Kurochko wrote:
>>> --- a/xen/arch/riscv/include/asm/p2m.h
>>> +++ b/xen/arch/riscv/include/asm/p2m.h
>>> @@ -26,6 +26,12 @@ struct p2m_domain {
>>>      /* Pages used to construct the p2m */
>>>      struct page_list_head pages;
>>>  
>>> +    /* The root of the p2m tree. May be concatenated */
>>> +    struct page_info *root;
>>> +
>>> +    /* Address Translation Table for the p2m */
>>> +    paddr_t hgatp;
>>
>> Does this really need holding in a struct field? Can't it be re-created
>> at any time from "root" above?
>
> Yes, with the current implementation, I agree root alone would be enough.
> But as you noticed below...
>
>> And such re-creation is apparently infrequent, if happening at all after
>> initial allocation. (But of course I don't know what future patches of
>> yours will bring.) This is even more so if ...
>>
>>> --- a/xen/arch/riscv/include/asm/riscv_encoding.h
>>> +++ b/xen/arch/riscv/include/asm/riscv_encoding.h
>>> @@ -133,11 +133,13 @@
>>>  #define HGATP_MODE_SV48X4       _UL(9)
>>>  
>>>  #define HGATP32_MODE_SHIFT      31
>>> +#define HGATP32_MODE_MASK       _UL(0x80000000)
>>>  #define HGATP32_VMID_SHIFT      22
>>>  #define HGATP32_VMID_MASK       _UL(0x1FC00000)
>>>  #define HGATP32_PPN             _UL(0x003FFFFF)
>>>  
>>>  #define HGATP64_MODE_SHIFT      60
>>> +#define HGATP64_MODE_MASK       _ULL(0xF000000000000000)
>>>  #define HGATP64_VMID_SHIFT      44
>>>  #define HGATP64_VMID_MASK       _ULL(0x03FFF00000000000)
>>
>> ... VMID management is going to change as previously discussed, at which
>> point the value to put in hgatp will need (partly) re-calculating at
>> certain points anyway.
>
> ... after VMID management is changed to a per-pCPU base, hgatp will need
> to be re-calculated each time the vCPU running on a pCPU changes.
> In this case I prefer to have a partially calculated 'hgatp'.

But why, when you need to do some recalculation anyway?
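To illustrate the point about re-creating hgatp on demand: a host-side
sketch (reusing the HGATP64_* encodings from the quoted hunk; the helper
name and the Sv39x4 mode choice are assumptions for illustration, not the
patch's actual code) might look roughly like:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT         12
#define HGATP_MODE_SV39X4  8ULL
#define HGATP64_MODE_SHIFT 60
#define HGATP64_MODE_MASK  0xF000000000000000ULL
#define HGATP64_VMID_SHIFT 44
#define HGATP64_VMID_MASK  0x03FFF00000000000ULL

/*
 * Rebuild the hgatp value from the root table's physical address and the
 * (eventually per-pCPU) VMID, instead of caching it in struct p2m_domain.
 */
static uint64_t hgatp_from_root(uint64_t root_paddr, uint64_t vmid)
{
    uint64_t hgatp = root_paddr >> PAGE_SHIFT; /* PPN field */

    hgatp |= (vmid << HGATP64_VMID_SHIFT) & HGATP64_VMID_MASK;
    hgatp |= (HGATP_MODE_SV39X4 << HGATP64_MODE_SHIFT) & HGATP64_MODE_MASK;

    return hgatp;
}
```

With per-pCPU VMIDs, such a helper would run on every vCPU-to-pCPU move
anyway, which is Jan's argument against storing the value in a field.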
>>> --- a/xen/arch/riscv/p2m.c
>>> +++ b/xen/arch/riscv/p2m.c
>>> @@ -41,6 +41,91 @@ void p2m_write_unlock(struct p2m_domain *p2m)
>>>      write_unlock(&p2m->lock);
>>>  }
>>>  
>>> +static void clear_and_clean_page(struct page_info *page)
>>> +{
>>> +    clean_dcache_va_range(page, PAGE_SIZE);
>>> +    clear_domain_page(page_to_mfn(page));
>>> +}
>>
>> A function of this name can, imo, only clear and then clean. Question is
>> why it's the other way around, and what the underlying requirement is for
>> the cleaning part to be there in the first place. Maybe that's obvious
>> for a RISC-V person, but it's entirely non-obvious to me (Arm being
>> different in this regard because of running with caches disabled at
>> certain points in time).
>
> You're right, the current name clear_and_clean_page() implies that
> clearing should come before cleaning, which contradicts the current
> implementation.
> The intent here is to ensure that the page contents are consistent in RAM
> (not just in cache) before use by other entities (guests or devices).
>
> The clean must follow the clear — so yes, the order needs to be reversed.

What you don't address though - why's the cleaning needed in the first
place?

>>> +static struct page_info *p2m_allocate_root(struct domain *d)
>>> +{
>>> +    struct page_info *page;
>>> +    unsigned int order = get_order_from_bytes(KB(16));
>>
>> While better than a hard-coded order of 2, this still is lacking. Is
>> there a reason there can't be a suitable manifest constant in the header?
>
> No specific reason; I just decided not to introduce a new definition, as
> it is going to be used only inside this function.
>
> I think it will make sense to have in p2m.c:
>     #define P2M_ROOT_PT_SIZE KB(16)
> If that isn't the best option, then what about moving this definition to
> config.h or asm/p2m.h?

It's defined by the hardware, so neither of the two headers looks to be a
good fit. Nor is the P2M_ prefix really in line with this being
hardware-defined.
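For the agreed reordering, a minimal host-side sketch (names mirror the
patch, but clean_dcache_va_range is stubbed out here and the page is a
plain buffer rather than a struct page_info, so this compiles standalone):

```c
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096

/*
 * Stand-in for Xen's cache-maintenance helper: on real hardware this
 * would write the covered cache lines back to RAM; here it is a no-op.
 */
static void clean_dcache_va_range(const void *va, size_t size)
{
    (void)va;
    (void)size;
}

/*
 * Clear first, then clean, so that the zeroed contents (and not stale
 * cache lines) are what guests or non-coherent devices observe in RAM.
 */
static void clear_and_clean_page(unsigned char *va)
{
    memset(va, 0, PAGE_SIZE);
    clean_dcache_va_range(va, PAGE_SIZE);
}
```

Whether the clean step is needed at all on RISC-V (Jan's open question)
is independent of this ordering fix.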
page.h has various paging-related hw definitions, and riscv_encoding.h may
also be a suitable place. There may be other candidates that I'm presently
overlooking.

>>> +    unsigned int nr_pages = _AC(1,U) << order;
>>
>> Nit (style): Missing blank after comma.
>
> I've changed that to BIT(order, U)
>
>>
>>> +    /* Return back nr_pages necessary for p2m root table. */
>>> +
>>> +    if ( ACCESS_ONCE(d->arch.paging.p2m_total_pages) < nr_pages )
>>> +        panic("Specify more xen,domain-p2m-mem-mb\n");
>>
>> You shouldn't panic() in anything involved in domain creation. You want
>> to return NULL in this case.
>
> It makes sense in this case just to return NULL.
>
>>
>> Further, to me the use of "more" looks misleading here. Do you perhaps
>> mean "larger" or "bigger"?
>>
>> This also looks to be happening without any lock held. If that's
>> intentional, I think the "why" wants clarifying in a code comment.
>
> Agree, returning back the pages necessary for the p2m root table should
> be done under spin_lock(&d->arch.paging.lock).

Which should be acquired at the paging_*() layer then, not at the p2m_*()
layer. (As long as you mean to have that separation, that is. See the
earlier discussion on that matter.)

>>> +    for ( unsigned int i = 0; i < nr_pages; i++ )
>>> +    {
>>> +        /* Return memory to domheap. */
>>> +        page = page_list_remove_head(&d->arch.paging.p2m_freelist);
>>> +        if( page )
>>> +        {
>>> +            ACCESS_ONCE(d->arch.paging.p2m_total_pages)--;
>>> +            free_domheap_page(page);
>>> +        }
>>> +        else
>>> +        {
>>> +            printk(XENLOG_ERR
>>> +                   "Failed to free P2M pages, P2M freelist is empty.\n");
>>> +            return NULL;
>>> +        }
>>> +    }
>>
>> The reason for doing this may also want to be put in a comment.
>
> I thought the comment above would be enough:
>     /* Return back nr_pages necessary for p2m root table. */

That describes what the code does, but not why.
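The accounting being discussed (take nr_pages out of the pool before
handing them back to the domheap, failing cleanly instead of panicking)
can be modelled in a standalone sketch; the struct and field names below
only mimic the patch and are not Xen's real types:

```c
#include <stdbool.h>

/* Toy model of the domain's p2m pool accounting. */
struct pool {
    unsigned long total_pages; /* models d->arch.paging.p2m_total_pages */
    unsigned long free_count;  /* models the length of p2m_freelist */
};

/*
 * Return nr_pages from the pool to the domheap. Fail cleanly (the caller
 * would then return NULL, and its caller -ENOMEM) if the pool was sized
 * too small, rather than panic()ing during domain creation. In Xen this
 * would run under the paging lock, taken at the paging_*() layer.
 */
static bool pool_take(struct pool *p, unsigned long nr_pages)
{
    unsigned long i;

    if ( p->total_pages < nr_pages )
        return false;

    for ( i = 0; i < nr_pages; i++ )
    {
        if ( !p->free_count )
            return false; /* freelist unexpectedly empty */
        p->free_count--;
        p->total_pages--; /* would be ACCESS_ONCE(...)-- in Xen */
    }

    return true;
}
```

The "why" comment Jan asks for would state that the root table is
allocated from the domheap rather than the pool (its order and alignment
differ), so an equal number of pool pages is released to keep the
domain's accounting balanced.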
>>> +{
>>> +    struct p2m_domain *p2m = p2m_get_hostp2m(d);
>>> +
>>> +    p2m->root = p2m_allocate_root(d);
>>> +    if ( !p2m->root )
>>> +        return -ENOMEM;
>>> +
>>> +    p2m->hgatp = hgatp_from_page(p2m);
>>> +
>>> +    return 0;
>>> +}
>>> +
>>>  static spinlock_t vmid_alloc_lock = SPIN_LOCK_UNLOCKED;
>>>  
>>>  /*
>>> @@ -228,5 +313,14 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
>>>      }
>>>  }
>>>  
>>> +    /*
>>> +     * First, wait for the p2m pool to be initialized. Then allocate
>>> +     * the root
>>
>> Why "wait"? There's no waiting here.
>
> I don't really get your question.
>
> "wait" here is about the initialization of the pool, which happens above
> this comment.

But there's no "waiting" involved. What you talk about is one thing
needing to happen after the other.

Jan