Re: [Xen-devel] [PATCH v4 7/7] xen/x86: use PCID feature
>>> On 27.03.18 at 11:07, <jgross@xxxxxxxx> wrote:
> --- a/xen/arch/x86/domain_page.c
> +++ b/xen/arch/x86/domain_page.c
> @@ -51,7 +51,7 @@ static inline struct vcpu *mapcache_current_vcpu(void)
>         if ( (v = idle_vcpu[smp_processor_id()]) == current )
>             sync_local_execstate();
>         /* We must now be running on the idle page table. */
> -        ASSERT(read_cr3() == __pa(idle_pg_table));
> +        ASSERT((read_cr3() & ~X86_CR3_PCID_MASK) == __pa(idle_pg_table));

This would better use X86_CR3_ADDR_MASK as well.

> @@ -102,7 +103,21 @@ void write_cr3_cr4(unsigned long cr3, unsigned long cr4)
>         t = pre_flush();
> 
>     if ( read_cr4() & X86_CR4_PGE )
> +        /*
> +         * X86_CR4_PGE set means PCID being inactive.
> +         * We have to purge the TLB via flipping cr4.pge.
> +         */
>         write_cr4(cr4 & ~X86_CR4_PGE);
> +    else if ( cpu_has_invpcid )
> +        /*
> +         * If we are using PCID purge the TLB via INVPCID as loading cr3
> +         * will affect the current PCID only.

s/current/new/ ?

> +         * If INVPCID is not supported we don't use PCIDs so loading cr3
> +         * will purge the TLB (we are in the "global pages off" branch).
> +         * invpcid_flush_all_nonglobals() seems to be faster than
> +         * invpcid_flush_all().
> +         */
> +        invpcid_flush_all_nonglobals();
> 
>     asm volatile ( "mov %0, %%cr3" : : "r" (cr3) : "memory" );

What about TLB entries that have been re-created between the INVPCID and
the write of CR3? It's not obvious to me that leaving them around is not
going to be a problem later, especially as CR3 is written frequently
with the no-flush bit set. Don't you need to do a single-context
invalidation for the prior PCID (if it differs from the new one)?
> @@ -499,7 +500,26 @@ void free_shared_domheap_page(struct page_info *page)
> 
>  void make_cr3(struct vcpu *v, mfn_t mfn)
>  {
> +    struct domain *d = v->domain;
> +
>     v->arch.cr3 = mfn_x(mfn) << PAGE_SHIFT;
> +    if ( is_pv_domain(d) && d->arch.pv_domain.pcid )
> +        v->arch.cr3 |= get_pcid_bits(v, false);
> +}
> +
> +unsigned long pv_guest_cr4_to_real_cr4(struct vcpu *v)

const

> +{
> +    struct domain *d = v->domain;

again

> @@ -298,11 +362,21 @@ int pv_domain_initialise(struct domain *d)
> 
> static void _toggle_guest_pt(struct vcpu *v, bool force_cr3)
> {
> +    struct domain *d = v->domain;

and one more

> --- a/xen/include/asm-x86/x86-defns.h
> +++ b/xen/include/asm-x86/x86-defns.h
> @@ -45,7 +45,9 @@
>  /*
>   * Intel CPU flags in CR3
>   */
> -#define X86_CR3_NOFLUSH (_AC(1, ULL) << 63)
> +#define X86_CR3_NOFLUSH (_AC(1, ULL) << 63)
> +#define X86_CR3_ADDR_MASK (PAGE_MASK & ~X86_CR3_NOFLUSH)

This would better be PAGE_MASK & PADDR_MASK: bits 52...62 are reserved
(the respective figure in chapter 2 of the SDM is not to be trusted;
the tables in the "4-level paging" section are more likely to be
correct).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel