
Re: [Xen-devel] [PATCH] x86/domain_page: implement pure per-vCPU mapping infrastructure



Thanks, I welcome the effort to make Xen more scalable. :-)

On Mon, Feb 03, 2020 at 06:36:53PM +0000, Hongyan Xia wrote:
> Rewrite the mapcache to be purely per-vCPU instead of partly per-vCPU
> and partly per-domain.
> 
> This patch is needed to address performance issues when we start relying
> on the mapcache, e.g., when we do not have a direct map. Currently, the
> per-domain lock on the mapcache is a bottleneck for multicore, causing
> performance degradation and even functional regressions. This patch
> makes the mapping structure per-vCPU and completely lockless.
> 

When I see "regression", I think of something that used to work but no
longer does. I don't think that's the case for the following two items.

I would just classify them as a bug and/or an improvement.

> Functional regression:
> 
> When a domain is run on more than 64 cores, FreeBSD 10 panics frequently
> due to occasional simultaneous set_singleshot_timer hypercalls from too
> many cores. Some cores will be blocked waiting on map_domain_page,
> eventually failing to set a timer in the future. FreeBSD cannot handle
> this and panics. This was fixed in later FreeBSD releases by handling
> -ETIME, but still the degradation in timer performance is a big issue.
> 
> Performance regression:
> 
> Many benchmarks see a performance drop when having a large core count.
> I have done a Geekbench on a 32-vCPU guest.
> 
> perf drop     old        new
> single       0.04%      0.18%
> multi        2.43%      0.08%
> 
> Removing the per-domain lock in the mapcache makes the multi-core
> performance almost identical to using the direct map for mappings.
> 
> There should be room for further optimisations, but this already
> improves over the old mapcache by a lot.
> 
> Note that entries in the maphash will occupy inuse slots. With 16 slots
> per vCPU and a maphash capacity of 8, we only have another 8 available,
> which is not enough for nested page table walks. We need to increase the
> number of slots in config.h.
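
For reference, the slot arithmetic as I read it (macro names are taken
from my memory of the current tree and may not match the patch exactly):

    /*
     * MAPCACHE_VCPU_ENTRIES: 16 slots per vCPU with this patch.
     * MAPHASH_ENTRIES:        8 recently used mappings kept in the
     *                         per-vCPU maphash, each pinning a slot.
     *
     * So only 16 - 8 = 8 slots remain free for new map_domain_page()
     * calls, fewer than a nested page table walk can require, hence
     * the increase in config.h.
     */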
> 
> Signed-off-by: Hongyan Xia <hongyxia@xxxxxxxxxx>


As far as I can tell, all vCPUs still share the same per-domain mapcache
region. The difference is that the region is now divided into a separate
subregion for each vCPU to use.
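
To make sure we are talking about the same thing, here is a toy model of
that partitioning (standalone C, all names are mine rather than the
patch's): one shared region statically split into per-vCPU windows, so a
vCPU only ever touches its own slots and no lock is needed.

    #include <stdint.h>
    #include <stdio.h>

    #define VCPUS            4
    #define SLOTS_PER_VCPU  16   /* MAPCACHE_VCPU_ENTRIES in Xen terms */

    /* One private in-use bitmap per vCPU; never shared, hence lockless. */
    static uint32_t inuse[VCPUS];

    /* Grab a free slot for this vCPU and return its index into the
     * shared region, or -1 if the vCPU's own window is exhausted. */
    static int map_slot(unsigned int vcpu)
    {
        for ( unsigned int i = 0; i < SLOTS_PER_VCPU; i++ )
            if ( !(inuse[vcpu] & (1u << i)) )
            {
                inuse[vcpu] |= 1u << i;
                return vcpu * SLOTS_PER_VCPU + i;  /* global slot index */
            }
        return -1;
    }

    int main(void)
    {
        printf("vCPU 2 gets global slot %d\n", map_slot(2)); /* 32 */
        return 0;
    }

The point being: the VA region is still per-domain, but the contention is
gone because no two vCPUs ever compete for the same slot.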

> ---
>  xen/arch/x86/domain.c        |   5 +-
>  xen/arch/x86/domain_page.c   | 229 +++++++++++------------------------
>  xen/include/asm-x86/config.h |   2 +-
>  xen/include/asm-x86/domain.h |  30 +----
>  4 files changed, 80 insertions(+), 186 deletions(-)
> 
> diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
> index f53ae5ff86..a278aa4678 100644
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -445,6 +445,9 @@ void arch_vcpu_destroy(struct vcpu *v)
>      xfree(v->arch.msrs);
>      v->arch.msrs = NULL;
>  
> +    xfree(v->arch.pv.mapcache);
> +    v->arch.pv.mapcache = NULL;
> +

Keep in mind that accessing the union {pv, hvm} before knowing the
exact variant is dangerous.

The mapcache is initialised for PV only, so it should be freed for PV
only. I think you need to move this into pv_vcpu_destroy.

If you don't want to do that because of the asymmetry with
mapcache_vcpu_init, you may want to introduce a mapcache_vcpu_destroy
for this purpose.
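
Something like this is what I have in mind (untested sketch only; the
helper does not exist today and I have made up its shape to mirror
mapcache_vcpu_init):

    /* xen/arch/x86/domain_page.c */
    void mapcache_vcpu_destroy(struct vcpu *v)
    {
        /* The mapcache is only ever allocated for PV vCPUs; keep the
         * guard so the helper is safe wherever it ends up called. */
        if ( !is_pv_vcpu(v) )
            return;

        xfree(v->arch.pv.mapcache);
        v->arch.pv.mapcache = NULL;
    }

called from pv_vcpu_destroy() rather than open-coding the xfree() in
arch_vcpu_destroy().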

(I will need to pull this patch to a branch to see the final incarnation
before I review further)

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

