
Re: [PATCH v8 1/3] x86/tlb: introduce a flush HVM ASIDs flag



On Wed, Apr 01, 2020 at 08:34:23AM +0200, Jan Beulich wrote:
> On 31.03.2020 18:45, Roger Pau Monné wrote:
> > On Tue, Mar 31, 2020 at 05:40:59PM +0200, Jan Beulich wrote:
> >> On 20.03.2020 19:42, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/mm/hap/hap.c
> >>> +++ b/xen/arch/x86/mm/hap/hap.c
> >>> @@ -118,7 +118,7 @@ int hap_track_dirty_vram(struct domain *d,
> >>>              p2m_change_type_range(d, begin_pfn, begin_pfn + nr,
> >>>                                    p2m_ram_rw, p2m_ram_logdirty);
> >>>  
> >>> -            flush_tlb_mask(d->dirty_cpumask);
> >>> +            flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>  
> >>>              memset(dirty_bitmap, 0xff, size); /* consider all pages dirty */
> >>>          }
> >>> @@ -205,7 +205,7 @@ static int hap_enable_log_dirty(struct domain *d, bool_t log_global)
> >>>           * to be read-only, or via hardware-assisted log-dirty.
> >>>           */
> >>>          p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>      }
> >>>      return 0;
> >>>  }
> >>> @@ -234,7 +234,7 @@ static void hap_clean_dirty_bitmap(struct domain *d)
> >>>       * be read-only, or via hardware-assisted log-dirty.
> >>>       */
> >>>      p2m_change_entry_type_global(d, p2m_ram_rw, p2m_ram_logdirty);
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>>  }
> >>>  
> >>>  /************************************************/
> >>> @@ -798,7 +798,7 @@ hap_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn, l1_pgentry_t *p,
> >>>  
> >>>      safe_write_pte(p, new);
> >>>      if ( old_flags & _PAGE_PRESENT )
> >>> -        flush_tlb_mask(d->dirty_cpumask);
> >>> +        flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> For all four - why FLUSH_TLB? Doesn't the flushing here solely care
> >> about guest translations?
> > 
> > Not on AMD, at least to my understanding; the AMD APM states:
> > 
> > "If a hypervisor modifies a nested page table by decreasing permission
> > levels, clearing present bits, or changing address translations and
> > intends to return to the same ASID, it should use either TLB command
> > 011b or 001b."
> > 
> > It's in section 15.16.1.
> > 
> > This to my understanding implies that on AMD hardware modifications to
> > the NPT require an ASID flush. I assume that on AMD ASIDs also cache
> > combined translations, guest linear -> host physical.
> 
> I guess I don't follow - I asked about FLUSH_TLB. I agree there needs
> to be FLUSH_HVM_ASID_CORE here.

I clearly misread your comment; I'm really sorry.

The main point of this patch was to remove the ASID flush from some of
the flush_tlb_mask() callers, not to remove TLB flushes. I understand
the two kinds of flush were intertwined before, and now that it's
possible to split them there are a lot of callers that can be further
refined. I think such further improvements should be done in a
separate patch, as this is IMO a tricky area where changes can
trigger very subtle bugs.
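
To illustrate the split with the flags introduced here (just a
sketch), callers can now request exactly the kind of flush they need:

    /* Flush the CPU TLBs only. */
    flush_mask(mask, FLUSH_TLB);
    /* Flush the guest TLBs (ASIDs/VPIDs) only. */
    flush_mask(mask, FLUSH_HVM_ASID_CORE);
    /* Both, matching what flush_tlb_mask() did before this patch. */
    flush_mask(mask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);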

> >>> --- a/xen/arch/x86/mm/hap/nested_hap.c
> >>> +++ b/xen/arch/x86/mm/hap/nested_hap.c
> >>> @@ -84,7 +84,7 @@ nestedp2m_write_p2m_entry(struct p2m_domain *p2m, unsigned long gfn,
> >>>      safe_write_pte(p, new);
> >>>  
> >>>      if (old_flags & _PAGE_PRESENT)
> >>> -        flush_tlb_mask(p2m->dirty_cpumask);
> >>> +        flush_mask(p2m->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Same here then I guess.
> >>
> >>> --- a/xen/arch/x86/mm/p2m-pt.c
> >>> +++ b/xen/arch/x86/mm/p2m-pt.c
> >>> @@ -896,7 +896,8 @@ static void p2m_pt_change_entry_type_global(struct p2m_domain *p2m,
> >>>      unmap_domain_page(tab);
> >>>  
> >>>      if ( changed )
> >>> -         flush_tlb_mask(p2m->domain->dirty_cpumask);
> >>> +         flush_mask(p2m->domain->dirty_cpumask,
> >>> +                    FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Given that this code is used in shadow mode as well, perhaps
> >> better to keep it here. Albeit maybe FLUSH_TLB could be dependent
> >> upon !hap_enabled()?
> >>
> >>> --- a/xen/arch/x86/mm/paging.c
> >>> +++ b/xen/arch/x86/mm/paging.c
> >>> @@ -613,7 +613,7 @@ void paging_log_dirty_range(struct domain *d,
> >>>  
> >>>      p2m_unlock(p2m);
> >>>  
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Same here?
> > 
> > I'm fine with doing further refinements, but I would like to be on the
> > conservative side and keep such flushes.
> 
> Well, if hap.c had FLUSH_TLB dropped, for consistency it should
> become conditional here, imo.
> 
> >>> @@ -993,7 +993,7 @@ static void shadow_blow_tables(struct domain *d)
> >>>                                 pagetable_get_mfn(v->arch.shadow_table[i]), 0);
> >>>  
> >>>      /* Make sure everyone sees the unshadowings */
> >>> -    flush_tlb_mask(d->dirty_cpumask);
> >>> +    flush_mask(d->dirty_cpumask, FLUSH_TLB | FLUSH_HVM_ASID_CORE);
> >>
> >> Taking this as example, wouldn't it be more consistent overall if
> >> paths not being HVM-only would specify FLUSH_HVM_ASID_CORE only
> >> for HVM domains?

I could introduce something specific for shadow:

#define sh_flush_tlb_mask(d, m) \
    flush_mask(m, FLUSH_TLB | (is_hvm_domain(d) ? FLUSH_HVM_ASID_CORE : 0))

And likely a similar macro for hap, that uses hap_enabled.
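
For the hap case, something along these lines might do (again just a
sketch; the name is made up, and it uses your !hap_enabled()
suggestion for the paths shared with shadow):

    #define hap_flush_tlb_mask(d, m) \
        flush_mask(m, (hap_enabled(d) ? 0 : FLUSH_TLB) | FLUSH_HVM_ASID_CORE)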

> > I think there's indeed room for improvement here, as it's likely
> > possible to drop some of the ASID/VPID flushes. Given that prior to
> > this patch we would flush ASIDs on every TLB flush, I think the
> > current approach is safe, and as said above further improvements
> > can be done afterwards.
> 
> There's no safety implication from my suggestion. Needless
> FLUSH_HVM_ASID_CORE for non-HVM will result in a call to
> hvm_flush_guest_tlbs(), with it then causing the generation
> to be incremented without there being any vCPU to consume
> this, as there's not going to be a VM entry without a prior
> context switch on the specific CPU.

As said above, I would rather do this in smaller steps, as I've
already had plenty of fun with this change. That said, I think what I
proposed is correct and already an improvement over the current
state.

Will prepare a new version with your suggestions.

> >> Also, seeing the large number of conversions, perhaps have another
> >> wrapper, e.g. flush_tlb_mask_hvm(), at least for the cases where
> >> both flags get specified unconditionally?
> > 
> > That's fine with me; if you agree with the proposed naming
> > (flush_tlb_mask_hvm) I'm happy to introduce the helper.
> 
> Well, I couldn't (and still can't) think of a better (yet not
> overly long) name, yet I'm also not fully happy with it.

I'm going to use the proposed name unless someone comes up with a
better suggestion that has consensus.
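
For reference, the helper would simply fold both flags (sketch,
modulo the final name):

    #define flush_tlb_mask_hvm(m) \
        flush_mask(m, FLUSH_TLB | FLUSH_HVM_ASID_CORE)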

Thanks, Roger.
