
Re: [Xen-devel] [PATCH 1/4] xen: report how much memory a domain has on each NUMA node



>>> On 05.03.14 at 17:13, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
> On mer, 2014-03-05 at 15:04 +0000, Jan Beulich wrote:
>> >>> On 05.03.14 at 15:36, Dario Faggioli <dario.faggioli@xxxxxxxxxx> wrote:
>> > +        if ( !memkb_on_node )
>> > +            break;
>> > +
>> > +        spin_lock(&d->page_alloc_lock);
>> > +        page_list_for_each(page, &d->page_list)
>> 
>> No new non-preemptable potentially long running loops like this,
>> please. See XSA-77.
>> 
> Not super familiar with XSAs, but do you perhaps mean XSA-45 ("Several
> long latency operations are not preemptible",
> xenbits.xenproject.org/xsa/advisory-45.html)? XSA-77 seems to be about
> "Hypercalls exposed to privilege rings 1 and 2 of HVM guests"...

XSA-77 is "Disaggregated domain management security status".
Is there anywhere this isn't shown that way?

> In any case, I see what you mean, and Juergen was also pointing out
> that this is going to take a long time. I of course agree, but when
> implementing it I wasn't sure what the best solution was.
> 
> I initially thought of, as Juergen says, instrumenting page allocation
> and adding the accounting there. However, I think that means doing
> something very similar to having a MAX_NUMNODES-sized array for each
> domain (or are there other ways?).

That's an option, and MAX_NUMNODES shouldn't be that large.
Another option is to check for preemption every so many
iterations. Your only problem then is where to store the
intermediate result.

>> > +        {
>> > +            node = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
>> 
>> Please don't open-code pfn_to_paddr()/page_to_maddr().
>> 
> Ok. BTW, for this, I looked at what dump_numa() in xen/arch/x86/numa.c
> does.
> 
> Should we fix that too -- and I mean wrt both the open coding and the
> non-preemptability? Or does it not matter in this case, as it's just a
> debug key handler?

Non-preemptability doesn't matter for that very reason. But
fixing open-coded constructs would certainly be nice.

>> > --- a/xen/include/public/domctl.h
>> > +++ b/xen/include/public/domctl.h
>> > @@ -315,6 +315,26 @@ typedef struct xen_domctl_max_vcpus xen_domctl_max_vcpus_t;
>> >  DEFINE_XEN_GUEST_HANDLE(xen_domctl_max_vcpus_t);
>> >  
>> >  
>> > +/* XEN_DOMCTL_numainfo */
>> > +struct xen_domctl_numainfo {
>> > +    /*
>> > +     * IN: maximum addressable entry in the caller-provided arrays.
>> > +     * OUT: minimum between the maximum addressable entry in the
>> > +     *      caller-provided arrays and largest online node identifier
>> > +     *      in the system.
>> > +     */
>> > +    uint32_t max_node_index;
>> 
>> With that IN/OUT specification (and the matching implementation
>> above), how would the caller know it passed too small a value to
>> fit all information?
>> 
> It doesn't. I designed it to be similar to XEN_SYSCTL_numainfo, which
> retrieves _system_wide_ NUMA information. Should I use a new, OUT only,
> parameter for the required number of elements (or the index of the
> largest), so that the caller can compare the two and figure things out?

Either that, or simply always update max_node_index to the
needed number of elements (with the comment suitably updated)
- after all you can be sure the caller knows the number it passed in.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
