
Re: [Xen-devel] [RFC PATCH] Start PV guest faster



>>> On 20.05.14 at 09:26, <frediano.ziglio@xxxxxxxxxx> wrote:
> Experimental patch that tries to allocate large chunks in order to
> start PV guests quickly.

The fundamental idea is certainly welcome.

> A while ago I noticed that the time to start a large PV guest depends
> on the amount of memory. For VMs with 64 or more GB of RAM the time
> can become quite significant (around 20 seconds). Digging around I
> found that a lot of time is spent populating RAM (from a single
> hypercall made by xenguest).

Did you check whether - as noticed elsewhere - this is due to
excessive hypercall preemption/restart? I.e. whether making
the preemption checks less fine-grained helps?
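For illustration, something along these lines in populate_physmap()
(names as in common/memory.c; the batching shift and its name are
made up for this sketch):

    /* Sketch only: check for preemption every 2^PREEMPT_CHECK_SHIFT
     * extents instead of on every single iteration.
     * PREEMPT_CHECK_SHIFT is not an existing Xen constant. */
    #define PREEMPT_CHECK_SHIFT 10

    for ( i = a->nr_done; i < a->nr_extents; i++ )
    {
        if ( i != a->nr_done &&
             !(i & ((1UL << PREEMPT_CHECK_SHIFT) - 1)) &&
             hypercall_preempt_check() )
        {
            a->preempted = 1;   /* caller arranges the continuation */
            goto out;
        }

        /* ... allocate and assign extent i as today ... */
    }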

> The improvement is quite significant (the hypercall is more than 20
> times faster for a machine with 3GB), but there are different things
> to consider:
> - should this optimization be done inside Xen? If the change is just
> userspace, this surely makes Xen simpler and safer, but on the other
> hand Xen is better placed to know whether allocating big chunks is
> preferable or not

Except that Xen has no way to tell what "better" here would be.

> - a debug Xen returns pages in reverse order, while the chunks have
> to be allocated sequentially. Is this a problem?

I think the ability to populate guest memory with (largely, but not
necessarily entirely) discontiguous memory should be retained for
debugging purposes (see also below).

> I didn't find any piece of code where superpages is turned on in
> xc_dom_image, but I think that if the number of pages is not a
> multiple of the superpage size the code allocates a bit less memory
> for the guest.

I think that's expected - I wonder whether that code is really in use
by anyone...
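For completeness, the rounding being referred to (using the order-9,
i.e. 2MB, superpage path in xc_dom_x86.c's arch_setup_meminit()):

    /* Only whole order-9 (512-page) extents get populated when
     * dom->superpages is set, so the remainder is silently dropped:
     *
     *   extents   = dom->total_pages >> SUPERPAGE_PFN_SHIFT;    /* 9 */
     *   populated = extents << SUPERPAGE_PFN_SHIFT;
     *   dropped   = dom->total_pages & ((1UL << SUPERPAGE_PFN_SHIFT) - 1);
     *
     * i.e. the guest can end up with up to 511 pages (just under 2MB)
     * less than asked for.
     */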

> @@ -820,9 +831,12 @@ int arch_setup_meminit(struct xc_dom_image *dom)
>              allocsz = dom->total_pages - i;
>              if ( allocsz > 1024*1024 )
>                  allocsz = 1024*1024;
> -            rc = xc_domain_populate_physmap_exact(
> -                dom->xch, dom->guest_domid, allocsz,
> -                0, 0, &dom->p2m_host[i]);
> +            /* try a big chunk of memory first */
> +            rc = -1;
> +            if ( (allocsz & ((1<<10)-1)) == 0 )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 10, allocsz);
> +            if ( rc )
> +                rc = populate_range(dom, &dom->p2m_host[i], i, 0, allocsz);
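(populate_range() itself isn't visible in the quoted context; for
readers of just this hunk, I'd guess it is shaped roughly like the
below - a reconstruction, not the actual patch code, and assuming the
hypercall hands back each extent's base MFN in the passed array:)

    static int populate_range(struct xc_dom_image *dom, xen_pfn_t *p2m,
                              xen_pfn_t base, unsigned int order,
                              xen_pfn_t count)
    {
        xen_pfn_t nr = count >> order, j, k;
        xen_pfn_t *extents = malloc(nr * sizeof(*extents));
        int rc;

        (void)base; /* unused in this sketch */

        if ( extents == NULL )
            return -1;

        /* One entry per extent: the guest pfn of each chunk's first
         * page. */
        for ( j = 0; j < nr; j++ )
            extents[j] = p2m[j << order];

        rc = xc_domain_populate_physmap_exact(dom->xch, dom->guest_domid,
                                              nr, order, 0, extents);

        /* Fan each extent's base MFN out over its individual pages. */
        for ( j = 0; rc == 0 && j < nr; j++ )
            for ( k = 0; k < ((xen_pfn_t)1 << order); k++ )
                p2m[(j << order) + k] = extents[j] + k;

        free(extents);
        return rc;
    }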

So on what basis was 10 chosen here? I wonder whether this
shouldn't be
(a) smaller by default,
(b) configurable (globally or even per guest),
(c) dependent on the total memory getting assigned to the guest,
(d) tried with sequentially decreasing order after failure (sketched
below).
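A minimal sketch of (d), reusing the hunk's variables and assuming
the same populate_range() helper plus a configurable max_order as
per (b):

    /* Sketch only: walk down through the orders until one both fits
     * the chunk size and succeeds.  max_order is hypothetical. */
    unsigned int order;

    for ( order = max_order; ; order-- )
    {
        rc = -1;
        if ( (allocsz & (((xen_pfn_t)1 << order) - 1)) == 0 )
            rc = populate_range(dom, &dom->p2m_host[i], i, order, allocsz);
        if ( rc == 0 || order == 0 )
            break;   /* success, or nothing smaller left to try */
    }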

Additionally you're certainly aware that allocation failures lead to
hypervisor log messages (as today already seen when HVM guests
can't have their order-18 or order-9 allocations fulfilled). We may
need to think about ways to suppress these messages for such
allocations where the caller intends to retry with a smaller order.
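One possible shape for that - MEMF_no_complain is a hypothetical
flag, not something Xen has today - would be for the caller to
announce that it intends to retry with a smaller order:

    /* Hypothetical: in populate_physmap() (common/memory.c), skip
     * the message when the caller passed the (made-up) flag. */
    page = alloc_domheap_pages(d, a->extent_order, a->memflags);
    if ( unlikely(page == NULL) )
    {
        if ( !(a->memflags & MEMF_no_complain) )   /* made-up flag */
            gdprintk(XENLOG_INFO,
                     "Could not allocate order=%u extent: id=%d\n",
                     a->extent_order, d->domain_id);
        goto out;
    }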

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

