
Re: [Xen-devel] [PATCH 1/4] xen: report how much memory a domain has on each NUMA node



>>> On 05.03.14 at 17:13, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
> On mer, 2014-03-05 at 15:04 +0000, Jan Beulich wrote:
>> >>> On 05.03.14 at 15:36, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
>> > +        if ( !memkb_on_node )
>> > +            break;
>> > +
>> > +        spin_lock(&d->page_alloc_lock);
>> > +        page_list_for_each(page, &d->page_list)
>> 
>> No new non-preemptable potentially long running loops like this,
>> please. See XSA-77.
>> 
> Not super familiar with XSAs, but do you perhaps mean XSA-45 ("Several
> long latency operations are not preemptible",
> xenbits.xenproject.org/xsa/advisory-45.html)? XSA-77 seems to be about
> "Hypercalls exposed to privilege rings 1 and 2 of HVM guests"...

XSA-77 is "Disaggregated domain management security status".
Is there anywhere this isn't shown that way?

> In any case, I see what you mean, and Juergen was also pointing out
> that this is going to take a long time. I of course agree, but when
> implementing it I wasn't sure what the best solution was.
> 
> I initially thought of, as Juergen says, instrumenting page allocation
> and adding the accounting there. However, I think that means doing
> something very similar to having a MAX_NUMNODES-sized array for each
> domain (or are there other ways?).

That's an option, and MAX_NUMNODES shouldn't be that large.
Another option is to check for preemption every so many
iterations. Your only problem then is where to store the
intermediate result.

>> > +        {
>> > +            node = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
>> 
>> Please don't open-code pfn_to_paddr()/page_to_maddr().
>> 
> Ok. BTW, for this, I looked at what dump_numa() in xen/arch/x86/numa.c
> does.
> 
> Should we fix that too -- and I mean wrt both the open coding and the
> non-preemptability? Or does it not matter in this case, as it's just a
> debug key handler?

Non-preemptability doesn't matter for that very reason. But
fixing open-coded constructs would certainly be nice.

>> > --- a/xen/include/public/domctl.h
>> > +++ b/xen/include/public/domctl.h
>> > @@ -315,6 +315,26 @@ typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>> >  
>> >  
>> > +/* XEN_DOMCTL_numainfo */
>> > +struct xen_domctl_numainfo {
>> > +    /*
>> > +     * IN: maximum addressable entry in the caller-provided arrays.
>> > +     * OUT: minimum between the maximum addressable entry in the
>> > +     *      caller-provided arrays and largest online node identifier
>> > +     *      in the system.
>> > +     */
>> > +    uint32_t max_node_index;
>> 
>> With that IN/OUT specification (and the matching implementation
>> above), how would the caller know it passed too small a value to
>> fit all information?
>> 
> It doesn't. I designed it to be similar to XEN_SYSCTL_numainfo, which
> retrieves _system_wide_ NUMA information. Should I use a new, OUT only,
> parameter for the required number of elements (or the index of the
> largest), so that the caller can compare the two and figure things out?

Either that, or simply always update max_node_index to the
needed number of elements (with the comment suitably updated)
- after all you can be sure the caller knows the number it passed in.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
