|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v7 2/3] xen/mm: Introduce NUMA-aware memory claim sets
On 08.05.2026 22:27, Bernhard Kaindl wrote: > This commit extends Xen's memory claim design to support installing claim > sets spanning multiple NUMA nodes atomically. As Roger Pau Monné suggested: > > Ideally, we would need to introduce a new hypercall that allows > making claims from multiple nodes in a single locked region, as to > ensure success or failure in an atomic way. > > A claim set can contain multiple node-specific claims and a host-wide > claim for memory that may come from any NUMA node. The new domctl > installs the full claim set atomically, and the allocator is updated > so that claim checks and claim consumption follow the new semantics. > > This adds: > > 1. installing multi-node claim sets atomically, > 2. protecting claimed pages from other claim requests and allocations, and > 3. redeeming held claims when satisfying allocations. > > Legacy XENMEM_claim_pages behaviour is preserved; the interface is > deprecated and superseded by XEN_DOMCTL_claim_memory. > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > Signed-off-by: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxxx> > > --- > The v7 design document submitted ahead of this series may help with review. > It explains the background, design rationale, and implementation details. > > Rendered version: https://xen.kaindl.dev/claims-v7-design/designs/claims > > Many thanks to everyone who contributed to the earlier work and review: > especially Roger Pau Monné, Alejandro Vallejo, Jan Beulich, Andrew Cooper, > Marcus Granado, and Edwin Török. > > Thanks, > Bernhard > --- > tools/include/xenctrl.h | 11 + > tools/libs/ctrl/xc_domain.c | 28 ++ > xen/common/domain.c | 5 +- > xen/common/domctl.c | 57 ++++ > xen/common/memory.c | 5 +- > xen/common/page_alloc.c | 410 +++++++++++++++++++++++----- > xen/include/public/domctl.h | 38 +++ > xen/include/public/memory.h | 2 + > xen/include/xen/mm.h | 6 +- > xen/include/xen/sched.h | 4 + > xen/xsm/flask/hooks.c | 1 + > xen/xsm/flask/policy/access_vectors | 1 + > 12 files changed, 493 insertions(+), 75 deletions(-) And there's no way to split this some? I'll only do some light review for now. > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -1319,7 +1319,10 @@ int domain_kill(struct domain *d) > rspin_barrier(&d->domain_lock); > argo_destroy(d); > vnuma_destroy(d->vnuma); > - domain_set_outstanding_pages(d, 0); > + /* Release all outstanding claims of the domain. */ > + domain_set_claim_entries(d, 1, &(xen_memory_claim_t){ > + .target = XEN_DOMCTL_CLAIM_MEMORY_HOST, .pages = 0, > + }); This is pretty badly formatted. Maybe domain_set_claim_entries( d, 1, &(xen_memory_claim_t){ .target = XEN_DOMCTL_CLAIM_MEMORY_HOST, .pages = 0, }); ? (Applies elsewhere as well, obviously.) Whether the .pages field needs explicitly mentioning is questionable. > --- a/xen/common/domctl.c > +++ b/xen/common/domctl.c > @@ -51,6 +51,57 @@ static int xenctl_bitmap_to_nodemask(nodemask_t *nodemask, > MAX_NUMNODES); > } > > +/* Set or get memory claims for a domain. */ > +static int claim_memory(struct domain *d, > + struct xen_domctl_claim_memory *uinfo, bool > *copyback) > +{ > + xen_memory_claim_t *entries; > + int rc = -EFAULT; > + > + /* Reject LLC coloring; alloc_color_heap_page() does not handle claims. > */ > + if ( llc_coloring_enabled ) > + return -EOPNOTSUPP; > + > + switch ( uinfo->mode ) > + { > + case XEN_DOMCTL_CLAIM_MEMORY_SET: > + if ( !uinfo->nr_entries ) > + return -EINVAL; > + if ( uinfo->nr_entries > MAX_NUMNODES + 1 ) Why's MAX_NUMNODES and MAX_NUMNODES + 1 as an input permitted? > + return -E2BIG; > + break; > + case XEN_DOMCTL_CLAIM_MEMORY_GET: > + if ( uinfo->nr_entries > MAX_NUMNODES + 1 ) > + uinfo->nr_entries = MAX_NUMNODES + 1; > + break; > + default: > + return -EOPNOTSUPP; > + } Here and elsewhere - blank lines please between non-fall-through case blocks. > + if ( d->is_dying ) > + return -ESRCH; > + > + entries = xmalloc_array(xen_memory_claim_t, uinfo->nr_entries); Better xzalloc_array() (really xvzalloc_array()) seeing ... > + if ( entries == NULL ) > + return -ENOMEM; > + > + switch ( uinfo->mode ) > + { > + case XEN_DOMCTL_CLAIM_MEMORY_SET: > + if ( !copy_from_guest(entries, uinfo->claim_set, uinfo->nr_entries) ) > + rc = domain_set_claim_entries(d, uinfo->nr_entries, entries); > + break; > + case XEN_DOMCTL_CLAIM_MEMORY_GET: > + rc = domain_get_claim_entries(d, &uinfo->nr_entries, entries); > + *copyback = true; > + if ( !rc && copy_to_guest(uinfo->claim_set, entries, > + uinfo->nr_entries) ) ... this copy-out. > + rc = -EFAULT; > + break; > + } Missing default:. > @@ -865,6 +916,12 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) > u_domctl) > copyback = true; > break; > > + case XEN_DOMCTL_claim_memory: > + ret = xsm_claim_pages(XSM_PRIV, d); This and the xsm/flask/hooks.c change will need reworking, to fit the spirit of XSA-492. > --- a/xen/include/public/domctl.h > +++ b/xen/include/public/domctl.h > @@ -1276,6 +1276,42 @@ struct xen_domctl_get_domain_state { > uint64_t unique_id; /* Unique domain identifier. */ > }; > > +struct xen_memory_claim { > + uint64_aligned_t pages; /* Number of pages to claim. */ > + uint32_t target; /* NUMA node or special target constant. */ > + uint32_t cmd; /* Reserved, must be zero. */ Why is this named "cmd" when it's reserved? > +}; > +typedef struct xen_memory_claim xen_memory_claim_t; > +DEFINE_XEN_GUEST_HANDLE(xen_memory_claim_t); > + > +/* Special claim targets for the target field of xen_memory_claim_t. */ > +#define XEN_DOMCTL_CLAIM_MEMORY_HOST 0x80000000U /* Host-wide claims. */ > +#define XEN_DOMCTL_CLAIM_MEMORY_LEGACY 0x40000000U /* Legacy semantics. */ I think the latter should be internal to Xen, and in particular not be permitted with this new domctl. > +/* > + * XEN_DOMCTL_claim_memory > + * > + * Install or query a domain memory claim set. A SET operation replaces the > + * existing claim set atomically. Claims are redeemed by later allocations to > + * the domain. A SET request whose entries all have pages == 0 releases any > + * existing claims. > + * > + * For GET, callers may pass nr_entries == 0 and claim_set == NULL to query > the > + * number of records needed. Xen returns -ERANGE and updates nr_entries. If > the > + * supplied array is too small, Xen returns -ERANGE and updates nr_entries > + * without copying partial records. > + */ > +struct xen_domctl_claim_memory { > + /* IN/OUT: Array of struct xen_memory_claim. */ > + XEN_GUEST_HANDLE_64(xen_memory_claim_t) claim_set; > + /* IN/OUT: Number of records in the claim_set array. */ > + uint32_t nr_entries; What is IN and what is OUT looks to differ between GET and SET. > + /* IN: Operation to perform on the claim set (GET or SET). */ > + uint32_t mode; > +#define XEN_DOMCTL_CLAIM_MEMORY_SET 0U /* Set the claim set for the domain. > */ > +#define XEN_DOMCTL_CLAIM_MEMORY_GET 1U /* Get the claim set of the domain. */ > +}; And there's no way to fetch overall claims? > @@ -131,7 +132,10 @@ int populate_pt_range(unsigned long virt, unsigned long > nr_mfns); > /* Claim handling */ > unsigned long __must_check domain_adjust_tot_pages(struct domain *d, > long pages); > -int domain_set_outstanding_pages(struct domain *d, unsigned long pages); > +int domain_set_claim_entries(struct domain *d, uint32_t nr_entries, > + const struct xen_memory_claim *claim_set); > +int domain_get_claim_entries(struct domain *d, uint32_t *nr_entries, > + struct xen_memory_claim *claim_set); Please see ./CODING_STYLE as to the use of fixed-width types. In the latter case I can see justification, but in the former I can't. > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -419,6 +419,10 @@ struct domain > unsigned int xenheap_pages; /* pages allocated from Xen heap */ > /* Pages claimed but not possessed, protected by global heap_lock. */ > unsigned int outstanding_pages; > + unsigned int node_claims; /* Sum of per-node claims. */ > + /* Domain objects use dedicated pages, leaving room for per-node claims. > */ > + unsigned int claims[MAX_NUMNODES]; /* Per-NUMA-node claims. */ Potentially too big an array to embed directly here. Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |