[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] x86/pv: Flush TLB in response to paging structure changes
On 21.10.2020 15:07, Andrew Cooper wrote: > @@ -4037,6 +4037,9 @@ long do_mmu_update( > break; > rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > + /* Paging structure maybe changed. Flush linear range. > */ > + if ( !rc ) > + flush_flags_all |= FLUSH_TLB; > break; > > case PGT_l3_page_table: > @@ -4044,6 +4047,9 @@ long do_mmu_update( > break; > rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > + /* Paging structure maybe changed. Flush linear range. > */ > + if ( !rc ) > + flush_flags_all |= FLUSH_TLB; > break; > > case PGT_l4_page_table: > @@ -4051,27 +4057,28 @@ long do_mmu_update( > break; > rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn, > cmd == MMU_PT_UPDATE_PRESERVE_AD, v); > - if ( !rc && pt_owner->arch.pv.xpti ) > + /* Paging structure maybe changed. Flush linear range. > */ > + if ( !rc ) > { > - bool local_in_use = false; > + bool local_in_use = mfn_eq( > + pagetable_get_mfn(curr->arch.guest_table), mfn); > > - if ( > mfn_eq(pagetable_get_mfn(curr->arch.guest_table), > - mfn) ) > - { > - local_in_use = true; > - get_cpu_info()->root_pgt_changed = true; > - } > + flush_flags_all |= FLUSH_TLB; > + > + if ( local_in_use ) > + flush_flags_local |= FLUSH_TLB | > FLUSH_ROOT_PGTBL; Aiui here (and in the code consuming the flags) you build upon flush_flags_local, when not zero, always being a superset of flush_flags_all. I think this is a trap to fall into when later wanting to change this code, but as per below this won't hold anymore anyway, I think. Hence here I think you want to set FLUSH_TLB unconditionally, and above for L3 and L2 you want to set it in both variables. Or, if I'm wrong below, a comment to that effect may help people avoid falling into this trap. An alternative would be to have flush_flags_local |= (flush_flags_all & FLUSH_TLB); before doing the actual flush. > /* > * No need to sync if all uses of the page can be > * accounted to the page lock we hold, its pinned > * status, and uses on this (v)CPU. > */ > - if ( (page->u.inuse.type_info & PGT_count_mask) > > + if ( pt_owner->arch.pv.xpti && I assume you've moved this here to avoid the further non-trivial checks when possible, but you've not put it around the setting of FLUSH_ROOT_PGTBL in flush_flags_local because setting ->root_pgt_changed is benign when !XPTI? > + (page->u.inuse.type_info & PGT_count_mask) > > (1 + !!(page->u.inuse.type_info & PGT_pinned) + > > mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user), > mfn) + local_in_use) ) > - sync_guest = true; > + flush_flags_all |= FLUSH_ROOT_PGTBL; This needs to also specify FLUSH_TLB_GLOBAL, or else original XPTI behavior gets broken. Since the local CPU doesn't need this, the variable may then better be named flush_flags_remote? Or if I'm wrong here, the reasoning behind the dropping of the global flush in this case needs putting in the description, not the least because it would mean the change introducing it went too far. > @@ -4173,18 +4180,36 @@ long do_mmu_update( > if ( va ) > unmap_domain_page(va); > > - if ( sync_guest ) > + /* > + * Flushing needs to occur for one of several reasons. > + * > + * 1) An update to an L2 or higher occured. This potentially changes the > + * pagetable structure, requiring a flush of the linear range. > + * 2) An update to an L4 occured, and XPTI is enabled. All CPUs running > + * on a copy of this L4 need refreshing. > + */ > + if ( flush_flags_all || flush_flags_local ) Minor remark: At least in x86 code it is more efficient to use | instead of || in such cases, to avoid relying on the compiler to carry out this small optimization. It may well be that all compilers we care about do so, but it's certainly not the case for all the compilers I've ever worked with. > { > + cpumask_t *mask = pt_owner->dirty_cpumask; > + > /* > - * Force other vCPU-s of the affected guest to pick up L4 entry > - * changes (if any). > + * Local flushing may be asymmetric with remote. If there is local > + * flushing to do, perform it separately and omit the current CPU > from > + * pt_owner->dirty_cpumask. > */ > - unsigned int cpu = smp_processor_id(); > - cpumask_t *mask = per_cpu(scratch_cpumask, cpu); > + if ( flush_flags_local ) > + { > + unsigned int cpu = smp_processor_id(); > + > + mask = per_cpu(scratch_cpumask, cpu); > + cpumask_copy(mask, pt_owner->dirty_cpumask); > + __cpumask_clear_cpu(cpu, mask); > + > + flush_local(flush_flags_local); > + } > > - cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu)); I understand you're of the opinion that cpumask_copy() + __cpumask_clear_cpu() is more efficient than cpumask_andnot()? (I guess I agree for high enough CPU counts.) Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |