
Re: [Xen-devel] [PATCH RFC 06/12] xen/x86: populate PVHv2 Dom0 physical memory map



On Thu, Aug 04, 2016 at 07:43:39PM +0100, Andrew Cooper wrote:
> On 02/08/16 10:19, Roger Pau Monne wrote:
> > On Fri, Jul 29, 2016 at 08:04:12PM +0100, Andrew Cooper wrote:
> >> On 29/07/16 17:29, Roger Pau Monne wrote:
> >>> +/* Calculate the biggest usable order given a size in bytes. */
> >>> +static inline unsigned int get_order(uint64_t size)
> >>> +{
> >>> +    unsigned int order;
> >>> +    uint64_t pg;
> >>> +
> >>> +    ASSERT((size & ~PAGE_MASK) == 0);
> >>> +    pg = PFN_DOWN(size);
> >>> +    for ( order = 0; pg >= (1UL << (order + 1)); order++ );
> >>> +
> >>> +    return order;
> >>> +}
> >> We already have get_order_from_bytes() and get_order_from_pages(), the
> >> latter of which looks like it will suit your usecase.
> > Not really, or at least they don't do the same as get_order. This
> > function calculates the biggest order you can use so that there are no
> > pages left over (ie: for a size of 3145728 bytes (3MiB) this function
> > returns order 9 (2MiB), while the other ones return order 10 (4MiB)).
> > I don't really understand why other places in the code request bigger
> > orders and then free the leftovers, doesn't this also cause memory
> > shattering?
> 
> Sounds like we want something like get_order_{floor,ceil}() which makes
> it obvious which way non-power-of-two sizes get rounded.

Right, that makes sense. I will rename the current one to ceil and add the
floor variant.
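
Something along these lines (just a sketch, assuming the helpers keep the
current ASSERT/PFN_DOWN logic; names and placement may change in the next
version):

    /* Biggest order such that (1 << order) pages fit in size (round down). */
    static inline unsigned int get_order_floor(uint64_t size)
    {
        unsigned int order;
        uint64_t pg;

        ASSERT((size & ~PAGE_MASK) == 0);
        pg = PFN_DOWN(size);
        for ( order = 0; pg >= (1UL << (order + 1)); order++ );

        return order;
    }

    /* Smallest order such that (1 << order) pages cover size (round up). */
    static inline unsigned int get_order_ceil(uint64_t size)
    {
        unsigned int order = get_order_floor(size);

        return (PFN_DOWN(size) == (1UL << order)) ? order : order + 1;
    }

For the 3MiB example above, get_order_floor() returns 9 (512 pages, 2MiB)
and get_order_ceil() returns 10 (1024 pages, 4MiB), matching what
get_order_from_bytes()/get_order_from_pages() do today.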

> >>> +            if ( order == 0 && memflags )
> >>> +            {
> >>> +                /* Try again without any memflags. */
> >>> +                memflags = 0;
> >>> +                order = MAX_ORDER;
> >>> +                continue;
> >>> +            }
> >>> +            if ( order == 0 )
> >>> +                panic("Unable to allocate memory with order 0!\n");
> >>> +            order--;
> >>> +            continue;
> >>> +        }
> >> It would be far more efficient to try and allocate only 1G and 2M
> >> blocks.  Most of memory is free at this point, and it would allow the
> >> use of HAP superpage mappings, which will be a massive performance boost
> >> if they aren't shattered.
> > That's what I'm trying to do, but we might have to use pages of lower order 
> > when filling the smaller gaps.
> 
> As a general principle, we should try not to have any gaps.  This also
> extends to guests using more intelligence when deciding how to mutate
> their physmaps.

Yes, but in this case we are limited by the original e820 from the host.
A DomU (without passthrough) will have all of its memory contiguous.
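
To make the strategy explicit, the populate loop ends up looking roughly
like this (a simplified sketch: alloc_chunk() stands in for the real
alloc_domheap_pages() plus guest_physmap_add_page() sequence in the patch,
and the memflags fallback quoted above is left out):

    /* Fill [pfn, pfn + nr_pages) using the biggest chunks possible. */
    static void populate_range(struct domain *d, unsigned long pfn,
                               unsigned long nr_pages)
    {
        while ( nr_pages != 0 )
        {
            /* Largest order that fits in what is left of the region... */
            unsigned int order = min(get_order_floor((uint64_t)nr_pages <<
                                                     PAGE_SHIFT),
                                     (unsigned int)MAX_ORDER);

            /*
             * ... further limited by the alignment of the start address,
             * so that the resulting p2m entries can use superpages.
             */
            while ( order != 0 && (pfn & ((1UL << order) - 1)) )
                order--;

            /* Shrink the chunk until the allocation succeeds. */
            while ( alloc_chunk(d, pfn, order) != 0 )
            {
                if ( order == 0 )
                    panic("Unable to allocate memory with order 0!\n");
                order--;
            }

            pfn += 1UL << order;
            nr_pages -= 1UL << order;
        }
    }

This way the 1GB and 2MB chunks get used wherever the e820 layout allows
it, and the lower orders only show up at the borders of the usable
regions.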
 
> >  As an example, these are the stats when
> > building a domain with 6048M of RAM:
> >
> > (XEN) Memory allocation stats:
> > (XEN) Order 18: 5GB
> > (XEN) Order 17: 512MB
> > (XEN) Order 15: 256MB
> > (XEN) Order 14: 128MB
> > (XEN) Order 12: 16MB
> > (XEN) Order 10: 8MB
> > (XEN) Order  9: 4MB
> > (XEN) Order  8: 2MB
> > (XEN) Order  7: 1MB
> > (XEN) Order  6: 512KB
> > (XEN) Order  5: 256KB
> > (XEN) Order  4: 128KB
> > (XEN) Order  3: 64KB
> > (XEN) Order  2: 32KB
> > (XEN) Order  1: 16KB
> > (XEN) Order  0: 4KB
> >
> > IMHO, they are quite good.
> 
> What are the RAM characteristics of the host?  Do you have any idea what
> the hap superpage characteristics are like after the guest has booted?

This is the host RAM map:

(XEN)  0000000000000000 - 000000000009c800 (usable)
(XEN)  000000000009c800 - 00000000000a0000 (reserved)
(XEN)  00000000000e0000 - 0000000000100000 (reserved)
(XEN)  0000000000100000 - 00000000ad662000 (usable)
(XEN)  00000000ad662000 - 00000000adb1f000 (reserved)
(XEN)  00000000adb1f000 - 00000000b228b000 (usable)
(XEN)  00000000b228b000 - 00000000b2345000 (reserved)
(XEN)  00000000b2345000 - 00000000b236a000 (ACPI data)
(XEN)  00000000b236a000 - 00000000b2c9a000 (ACPI NVS)
(XEN)  00000000b2c9a000 - 00000000b2fff000 (reserved)
(XEN)  00000000b2fff000 - 00000000b3000000 (usable)
(XEN)  00000000b3800000 - 00000000b8000000 (reserved)
(XEN)  00000000f8000000 - 00000000fc000000 (reserved)
(XEN)  00000000fec00000 - 00000000fec01000 (reserved)
(XEN)  00000000fed00000 - 00000000fed04000 (reserved)
(XEN)  00000000fed1c000 - 00000000fed20000 (reserved)
(XEN)  00000000fee00000 - 00000000fee01000 (reserved)
(XEN)  00000000ff000000 - 0000000100000000 (reserved)
(XEN)  0000000100000000 - 0000000247000000 (usable)

No idea about the HAP superpage characteristics; how can I fetch this
information? (I know I can dump the guest EPT tables, but that just
saturates the console.)

> In a case like this, I think it would be entirely reasonable to round up
> to the nearest 2MB, and avoid all of those small page mappings.

Hm, but then we would be expanding the RAM region, and we would have to
either modify the guest e820 to reflect that (which I think is a bad idea,
given that we might be shadowing MMIO regions), or simply use a 2MB page
to cover a 4KB hole, in which case we are throwing away memory and the
computation of the required memory will be off.
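
To put a number on it with the map above: the single usable page at
b2fff000-b3000000 would need a 2MB mapping over b2e00000-b3000000, which
shadows about 2044KB of the reserved area right below it. Per usable
region the cost of rounding out to 2MB boundaries would be something like
this (hypothetical helper, just to illustrate the accounting, using Xen's
MB() and ROUNDUP() macros):

    /* Extra space consumed if [start, end) is expanded to 2MB boundaries. */
    static uint64_t superpage_round_waste(uint64_t start, uint64_t end)
    {
        uint64_t new_start = start & ~(MB(2) - 1);  /* round down */
        uint64_t new_end = ROUNDUP(end, MB(2));     /* round up */

        return (new_end - new_start) - (end - start);
    }

That difference either shadows adjacent reserved/MMIO space or, if backed
by real RAM, throws the memory accounting for the domain off, and on the
map above almost every usable region boundary would add some of this
overhead.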

Roger.
