Re: [Xen-devel] [PATCH v2] xen/arm: p2m: Correctly flush TLB in create_p2m_entries
Hrm, our TLB flush discipline is horribly confused, isn't it...

On Thu, 2014-01-09 at 16:34 +0000, Julien Grall wrote:
> The p2m is shared between VCPUs for each domain. Currently Xen only flush
> TLB on the local PCPU. This could result to mismatch between the mapping
> in the p2m and TLBs.
>
> Flush TLB entries used by this domain on every PCPU. The flush can also
> be moved out of the loop because:
>     - ALLOCATE: only called for dom0 RAM allocation, so the flush is
>       never called

OK. An ASSERT(!third[third_table_offset(addr)].p2m.valid) might be
worthwhile if that is the case.

(I'm not sure why ALLOCATE can't be replaced by allocation followed by an
INSERT; it seems a very special case.)

>     - INSERT: if valid = 1 that would means with have replaced a page
>       that already belongs to the domain. A VCPU can write on the wrong
>       page. This can append for dom0 with the 1:1 mapping because the
>       mapping is not removed from the p2m.

"append"? Do you mean "happen"?

In the non-dom0 1:1 case eventually the page will be freed, I guess by a
subsequent put_page elsewhere -- do they all contain the correct flushing?
Or do we just leak?

What happens if the page being replaced is foreign? Do we leak a reference
to another domain's page? (This is probably a separate issue though; I
suspect the put_page needs pulling out of the switch(op) statement.)

>     - REMOVE: except for grant-table (replace_grant_host_mapping),

What about this case? What makes it safe?

>       each call to guest_physmap_remove_page are protected by the callers
>       via a get_page -> .... -> guest_physmap_remove_page -> ... ->
>       put_page. So the page can't be allocated for another domain until
>       the last put_page.

I have confirmed this is the case for guest_remove_page, memory_exchange
and XENMEM_remove_from_physmap.

There is one case I saw where this isn't true, which is gnttab_transfer;
AFAICT that will fail because steal_page unconditionally returns an error
on arm. There is also a flush_tlb_mask there, FWIW.
> - RELINQUISH : the domain is not running anymore so we don't care...

At some point this page will be reused, as will the VMID; where/how is it
ensured that a flush will happen before that point?

So I think the main outstanding question is what makes
replace_grant_host_mapping safe. The other big issue is the leak of a
foreign mapping on INSERT, but I think that is separate.

> Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
>
> ---
> Changes in v2:
>     - Switch to the domain for only flush its TLBs entries
>     - Move the flush out of the loop
>
> This is a possible bug fix (found by reading the code) for Xen 4.4. I
> moved the flush out of the loop, which should be safe (see why in the
> commit message). Without this patch, the guest can have stale TLB
> entries when the VCPU is moved to another PCPU.
>
> Except grant-table (I can't find {get,put}_page for grant-table
> code???), all the callers are protected by a get_page before removing
> the page. So if another VCPU is trying to access this page before the
> flush, it will just read/write the wrong page.
>
> The downside of this patch is Xen flushes fewer TLB entries. Instead of
> flushing all TLBs on the current PCPU, Xen flushes TLBs for a specific
> VMID on every CPU. This should be safe because create_p2m_entries only
> deals with a specific domain.
>
> I don't think I have forgotten a case in this function. Let me know if
> I have.
> ---
>  xen/arch/arm/p2m.c | 24 +++++++++++++++++++-----
>  1 file changed, 19 insertions(+), 5 deletions(-)
>
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 11f4714..ad6f76e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -238,7 +238,7 @@ static int create_p2m_entries(struct domain *d,
>                                int mattr,
>                                p2m_type_t t)
>  {
> -    int rc, flush;
> +    int rc;
>      struct p2m_domain *p2m = &d->arch.p2m;
>      lpae_t *first = NULL, *second = NULL, *third = NULL;
>      paddr_t addr;
> @@ -246,10 +246,14 @@ static int create_p2m_entries(struct domain *d,
>                    cur_first_offset = ~0,
>                    cur_second_offset = ~0;
>      unsigned long count = 0;
> +    unsigned int flush = 0;
>      bool_t populate = (op == INSERT || op == ALLOCATE);
>
>      spin_lock(&p2m->lock);
>
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(d);
> +
>      addr = start_gpaddr;
>      while ( addr < end_gpaddr )
>      {
> @@ -316,7 +320,7 @@ static int create_p2m_entries(struct domain *d,
>              cur_second_offset = second_table_offset(addr);
>          }
>
> -        flush = third[third_table_offset(addr)].p2m.valid;
> +        flush |= third[third_table_offset(addr)].p2m.valid;
>
>          /* Allocate a new RAM page and attach */
>          switch (op) {
> @@ -373,9 +377,6 @@ static int create_p2m_entries(struct domain *d,
>              break;
>          }
>
> -        if ( flush )
> -            flush_tlb_all_local();
> -
>          /* Preempt every 2MiB (mapped) or 32 MiB (unmapped) - arbitrary */
>          if ( op == RELINQUISH && count >= 0x2000 )
>          {
> @@ -392,6 +393,16 @@ static int create_p2m_entries(struct domain *d,
>          addr += PAGE_SIZE;
>      }
>
> +    if ( flush )
> +    {
> +        /* At the beginning of the function, Xen is updating VTTBR
> +         * with the domain where the mappings are created. In this
> +         * case it's only necessary to flush TLBs on every CPUs with
> +         * the current VMID (our domain).
> +         */
> +        flush_tlb();
> +    }
> +
>      if ( op == ALLOCATE || op == INSERT )
>      {
>          unsigned long sgfn = paddr_to_pfn(start_gpaddr);
> @@ -409,6 +420,9 @@ out:
>      if (second) unmap_domain_page(second);
>      if (first) unmap_domain_page(first);
>
> +    if ( d != current->domain )
> +        p2m_load_VTTBR(current->domain);
> +
>      spin_unlock(&p2m->lock);
>
>      return rc;

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel