Re: [PATCH] x86/pv: Flush TLB in response to paging structure changes
On 21/10/2020 14:56, Jan Beulich wrote:
> On 21.10.2020 15:07, Andrew Cooper wrote:
>> @@ -4037,6 +4037,9 @@ long do_mmu_update(
>>                      break;
>>                  rc = mod_l2_entry(va, l2e_from_intpte(req.val), mfn,
>>                                    cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>> +                /* Paging structure maybe changed. Flush linear range. */
>> +                if ( !rc )
>> +                    flush_flags_all |= FLUSH_TLB;
>>                  break;
>>
>>              case PGT_l3_page_table:
>> @@ -4044,6 +4047,9 @@ long do_mmu_update(
>>                      break;
>>                  rc = mod_l3_entry(va, l3e_from_intpte(req.val), mfn,
>>                                    cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>> +                /* Paging structure maybe changed. Flush linear range. */
>> +                if ( !rc )
>> +                    flush_flags_all |= FLUSH_TLB;
>>                  break;
>>
>>              case PGT_l4_page_table:
>> @@ -4051,27 +4057,28 @@ long do_mmu_update(
>>                      break;
>>                  rc = mod_l4_entry(va, l4e_from_intpte(req.val), mfn,
>>                                    cmd == MMU_PT_UPDATE_PRESERVE_AD, v);
>> -                if ( !rc && pt_owner->arch.pv.xpti )
>> +                /* Paging structure maybe changed. Flush linear range. */
>> +                if ( !rc )
>>                  {
>> -                    bool local_in_use = false;
>> +                    bool local_in_use = mfn_eq(
>> +                        pagetable_get_mfn(curr->arch.guest_table), mfn);
>>
>> -                    if ( mfn_eq(pagetable_get_mfn(curr->arch.guest_table),
>> -                                mfn) )
>> -                    {
>> -                        local_in_use = true;
>> -                        get_cpu_info()->root_pgt_changed = true;
>> -                    }
>> +                    flush_flags_all |= FLUSH_TLB;
>> +
>> +                    if ( local_in_use )
>> +                        flush_flags_local |= FLUSH_TLB | FLUSH_ROOT_PGTBL;
> Aiui here (and in the code consuming the flags) you build upon
> flush_flags_local, when not zero, always being a superset of
> flush_flags_all. I think this is a trap to fall into when later
> wanting to change this code, but as per below this won't hold
> anymore anyway, I think. Hence here I think you want to set
> FLUSH_TLB unconditionally, and above for L3 and L2 you want to
> set it in both variables. Or, if I'm wrong below, a comment to
> that effect may help people avoid falling into this trap.
>
> An alternative would be to have
>
> flush_flags_local |= (flush_flags_all & FLUSH_TLB);
>
> before doing the actual flush.
Honestly, this is what I meant by stating that the asymmetry is a total
mess.
I originally named 'all' as 'remote', but that is even less accurate, as it
may still include the current CPU.
Our matrix of complexity:
* FLUSH_TLB for L2+ structure changes
* FLUSH_TLB_GLOBAL/FLUSH_ROOT_PGTBL for XPTI
with optimisations to skip GLOBAL/ROOT_PGTBL on the local CPU if none of
the updates hit the in-use L4, and to skip the remote flush if we hold all
references on the L4.
Everything is complicated because pt_owner may not be current, for
toolstack operations constructing a new domain.
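
To make that concrete, what the two flag words are meant to accumulate is
roughly the following (a sketch only - 'modified_l2_or_l3', 'modified_l4',
'l4_in_use_locally' and 'l4_possibly_in_use_elsewhere' are stand-in
conditions, not the real code):

    unsigned int flush_flags_all = 0;   /* flushing for pt_owner's dirty CPUs */
    unsigned int flush_flags_local = 0; /* extra flushing for the current CPU only */

    /* Any L2+ write may rearrange the paging structure => flush linear range. */
    if ( modified_l2_or_l3 || modified_l4 )
        flush_flags_all |= FLUSH_TLB;

    if ( modified_l4 )
    {
        /* XPTI: this CPU is running on a copy of the modified L4. */
        if ( l4_in_use_locally )
            flush_flags_local |= FLUSH_TLB | FLUSH_ROOT_PGTBL;

        /* XPTI: other CPUs may also be running on copies of it. */
        if ( l4_possibly_in_use_elsewhere )
            flush_flags_all |= FLUSH_ROOT_PGTBL;
    }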
>
>>                      /*
>>                       * No need to sync if all uses of the page can be
>>                       * accounted to the page lock we hold, its pinned
>>                       * status, and uses on this (v)CPU.
>>                       */
>> -                    if ( (page->u.inuse.type_info & PGT_count_mask) >
>> +                    if ( pt_owner->arch.pv.xpti &&
> I assume you've moved this here to avoid the further non-trivial
> checks when possible, but you've not put it around the setting
> of FLUSH_ROOT_PGTBL in flush_flags_local because setting
> ->root_pgt_changed is benign when !XPTI?
No - that was accidental, while attempting to reduce the diff.
>
>> +                         (page->u.inuse.type_info & PGT_count_mask) >
>>                           (1 + !!(page->u.inuse.type_info & PGT_pinned) +
>>                            mfn_eq(pagetable_get_mfn(curr->arch.guest_table_user),
>>                                   mfn) + local_in_use) )
>> -                        sync_guest = true;
>> +                        flush_flags_all |= FLUSH_ROOT_PGTBL;
> This needs to also specify FLUSH_TLB_GLOBAL, or else original
> XPTI behavior gets broken. Since the local CPU doesn't need this,
> the variable may then better be named flush_flags_remote?
See above.  'remote' is even more confusing than 'all'.
>
> Or if I'm wrong here, the reasoning behind dropping the global
> flush in this case needs putting in the description, not least
> because it would mean the change introducing it went too far.
>
>> @@ -4173,18 +4180,36 @@ long do_mmu_update(
>>      if ( va )
>>          unmap_domain_page(va);
>>
>> -    if ( sync_guest )
>> +    /*
>> +     * Flushing needs to occur for one of several reasons.
>> +     *
>> +     * 1) An update to an L2 or higher occurred. This potentially changes the
>> +     *    pagetable structure, requiring a flush of the linear range.
>> +     * 2) An update to an L4 occurred, and XPTI is enabled. All CPUs running
>> +     *    on a copy of this L4 need refreshing.
>> +     */
>> +    if ( flush_flags_all || flush_flags_local )
> Minor remark: At least in x86 code it is more efficient to use
> | instead of || in such cases, to avoid relying on the compiler
> to carry out this small optimization.
This transformation should not be recommended in general.  There are cases,
including this one, where it has no effect at best, and is wrong at worst.
I don't care about people using ancient compilers. They've got far
bigger (== more impactful) problems than (the absence of) this
transformation, and the TLB flush will dwarf anything the compiler does
here.
However, the hand-"optimised" form stops the compiler from spotting that the
entire second clause is actually redundant for now.
I specifically didn't encode the dependency, to avoid subtle bugs
if/when someone alters the logic.
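
To spell out the general difference (illustration only, not Xen code - the
helper names here are made up):

    /* '||' short-circuits: the right operand is evaluated only if needed. */
    if ( cheap_check() || expensive_check() )
        do_work();

    /* '|' always evaluates both operands, then tests the combined value.
     * For two plain flag words already computed this is at best a wash; if
     * either operand is expensive or has side effects, it is a pessimisation
     * or outright wrong. */
    if ( flags_a | flags_b )
        do_work();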
>
>>      {
>> +        cpumask_t *mask = pt_owner->dirty_cpumask;
>> +
>>          /*
>> -         * Force other vCPU-s of the affected guest to pick up L4 entry
>> -         * changes (if any).
>> +         * Local flushing may be asymmetric with remote. If there is local
>> +         * flushing to do, perform it separately and omit the current CPU from
>> +         * pt_owner->dirty_cpumask.
>>          */
>> -        unsigned int cpu = smp_processor_id();
>> -        cpumask_t *mask = per_cpu(scratch_cpumask, cpu);
>> +        if ( flush_flags_local )
>> +        {
>> +            unsigned int cpu = smp_processor_id();
>> +
>> +            mask = per_cpu(scratch_cpumask, cpu);
>> +            cpumask_copy(mask, pt_owner->dirty_cpumask);
>> +            __cpumask_clear_cpu(cpu, mask);
>> +
>> +            flush_local(flush_flags_local);
>> +        }
>>
>> -        cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));
> I understand you're of the opinion that cpumask_copy() +
> __cpumask_clear_cpu() is more efficient than cpumask_andnot()?
> (I guess I agree for high enough CPU counts.)
It's faster in all cases, even at low CPU counts.
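
Concretely, the two ways of building "dirty CPUs minus the current one" being
compared (same result either way; just a sketch for reference):

    unsigned int cpu = smp_processor_id();
    cpumask_t *mask = per_cpu(scratch_cpumask, cpu);

    /* Old: dst = dirty_cpumask & ~cpumask_of(cpu), combining both masks. */
    cpumask_andnot(mask, pt_owner->dirty_cpumask, cpumask_of(cpu));

    /* New: copy one mask, then clear a single bit. */
    cpumask_copy(mask, pt_owner->dirty_cpumask);
    __cpumask_clear_cpu(cpu, mask);

The saving presumably being that the copy-and-clear form reads only one source
mask per word and then flips a single bit, rather than combining two masks.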
~Andrew