Re: [Xen-devel] [PATCH 3/3] xen/arm: p2m: Perform local TLB invalidation on vCPU migration
On Wed, 8 Mar 2017, Julien Grall wrote:
> The ARM architecture allows an OS to have per-CPU page tables, as it
> guarantees that TLBs never migrate from one CPU to another.
>
> This works fine until this is done in a guest. Consider the following
> scenario:
> - vcpu-0 maps P to V
> - vcpu-1 maps P' to V
>
> If run on the same physical CPU, vcpu-1 can hit in TLBs generated by
> vcpu-0 accesses, and access the wrong physical page.
>
> The solution to this is to keep a per-p2m map of which vCPU last ran
> on each pCPU, and invalidate local TLBs when two vCPUs from the same
> VM run on the same pCPU.
>
> Unfortunately it is not possible to allocate per-cpu variables on the
> fly. So for now the size of the array is NR_CPUS; this is fine because
> we still have space in struct domain. We may want to add a helper to
> allocate per-cpu variables in the future.
I think I understand the problem, thanks for the description. I have a
question: when a vcpu of a different domain is scheduled on a pcpu,
there is no impact on TLB flushes in relation to this problem, is there?
For example:
1) vcpu0/domain1 -> pcpu0
2) vcpu0/domain2 -> pcpu0
3) vcpu1/domain1 -> pcpu0
In 3) we still need to do the flush_tlb_local(), regardless of whether
2) happened, right?
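As a sanity check on my understanding, here is a hypothetical userspace
walk-through of the patch's comparison for that interleaving (struct
p2m_sim, run(), the single-pCPU array and the INVALID_VCPU_ID value are
all made up for illustration; only the comparison mirrors the patch
below):

#include <stdint.h>
#include <stdio.h>

#define INVALID_VCPU_ID 0xff  /* placeholder value, for illustration */

/* One entry per pCPU; here only pcpu0 matters. */
struct p2m_sim { uint8_t last_vcpu_ran[1]; };

/* Mimics the check added to p2m_restore_state() in the patch. */
static void run(struct p2m_sim *p2m, unsigned int pcpu, uint8_t vcpu_id,
                const char *step)
{
    uint8_t *last = &p2m->last_vcpu_ran[pcpu];

    if ( *last != INVALID_VCPU_ID && *last != vcpu_id )
        printf("%s: flush_tlb_local()\n", step);
    else
        printf("%s: no flush\n", step);

    *last = vcpu_id;
}

int main(void)
{
    struct p2m_sim dom1 = { { INVALID_VCPU_ID } };
    struct p2m_sim dom2 = { { INVALID_VCPU_ID } };

    run(&dom1, 0, 0, "1) vcpu0/domain1"); /* first use: no flush */
    run(&dom2, 0, 0, "2) vcpu0/domain2"); /* only touches domain2's p2m */
    run(&dom1, 0, 1, "3) vcpu1/domain1"); /* 0 != 1 in domain1's p2m:
                                             flush, whether or not 2) ran */
    return 0;
}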
Also, I have one suggestion below.
> Signed-off-by: Julien Grall <julien.grall@xxxxxxx>
>
> ---
> This patch is a candidate for backport to Xen 4.8, 4.7 and 4.6.
> ---
> xen/arch/arm/p2m.c | 24 ++++++++++++++++++++++++
> xen/include/asm-arm/p2m.h | 3 +++
> 2 files changed, 27 insertions(+)
>
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 1fc6ca3bb2..626376090d 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -130,6 +130,7 @@ void p2m_restore_state(struct vcpu *n)
> {
> register_t hcr;
> struct p2m_domain *p2m = &n->domain->arch.p2m;
> + uint8_t *last_vcpu_ran;
>
> if ( is_idle_vcpu(n) )
> return;
> @@ -149,6 +150,17 @@ void p2m_restore_state(struct vcpu *n)
>
> WRITE_SYSREG(hcr, HCR_EL2);
> isb();
> +
> + last_vcpu_ran = &p2m->last_vcpu_ran[smp_processor_id()];
> +
> + /*
> + * Flush local TLB for the domain to prevent wrong TLB translation
> + * when running multiple vCPUs of the same domain on a single pCPU.
> + */
> + if ( *last_vcpu_ran != INVALID_VCPU_ID && *last_vcpu_ran != n->vcpu_id )
> + flush_tlb_local();
> +
> + *last_vcpu_ran = n->vcpu_id;
> }
>
> static void p2m_flush_tlb(struct p2m_domain *p2m)
> @@ -1247,6 +1259,7 @@ int p2m_init(struct domain *d)
> {
> struct p2m_domain *p2m = &d->arch.p2m;
> int rc = 0;
> + unsigned int cpu;
>
> rwlock_init(&p2m->lock);
> INIT_PAGE_LIST_HEAD(&p2m->pages);
> @@ -1275,6 +1288,17 @@ int p2m_init(struct domain *d)
>
> rc = p2m_alloc_table(d);
>
> + /*
> + * Make sure that the type chosen is able to store any vCPU ID
> + * between 0 and the maximum number of virtual CPUs supported, as
> + * well as INVALID_VCPU_ID.
> + */
> + BUILD_BUG_ON((1 << (sizeof(p2m->last_vcpu_ran[0]) * 8)) < MAX_VIRT_CPUS);
> + BUILD_BUG_ON((1 << (sizeof(p2m->last_vcpu_ran[0]) * 8)) < INVALID_VCPU_ID);
> +
> + for_each_possible_cpu(cpu)
> + p2m->last_vcpu_ran[cpu] = INVALID_VCPU_ID;
> +
> return rc;
> }
>
> diff --git a/xen/include/asm-arm/p2m.h b/xen/include/asm-arm/p2m.h
> index 0899523084..18c57f936e 100644
> --- a/xen/include/asm-arm/p2m.h
> +++ b/xen/include/asm-arm/p2m.h
> @@ -96,6 +96,9 @@ struct p2m_domain {
>
> /* back pointer to domain */
> struct domain *domain;
> +
> + /* Track, for each pCPU, the vCPU of this domain that last ran on it */
> + uint8_t last_vcpu_ran[NR_CPUS];
> };
NR_CPUS is usually much larger than the actual number of possible CPUs
(cpu_possible_map). I think it would be best to allocate last_vcpu_ran
dynamically (after cpu_possible_map is set). It should reduce the memory
impact significantly.
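Roughly, an untested sketch of what I have in mind (this assumes struct
p2m_domain gains a "uint8_t *last_vcpu_ran;" pointer instead of the
array, that xzalloc_array()/nr_cpu_ids/for_each_possible_cpu are usable
at this point, and that p2m_init() runs after cpu_possible_map is
populated):

/* In p2m_init(), instead of initializing the static array: */
p2m->last_vcpu_ran = xzalloc_array(uint8_t, nr_cpu_ids);
if ( p2m->last_vcpu_ran == NULL )
    return -ENOMEM;

for_each_possible_cpu(cpu)
    p2m->last_vcpu_ran[cpu] = INVALID_VCPU_ID;

/* ... with a matching xfree(p2m->last_vcpu_ran) on the teardown path. */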