
Re: [PATCH for-4.14 v2] x86/tlb: fix assisted flush usage



On Tue, Jun 23, 2020 at 04:46:29PM +0100, Julien Grall wrote:
> 
> 
> On 23/06/2020 16:15, Roger Pau Monné wrote:
> > On Tue, Jun 23, 2020 at 04:08:06PM +0100, Julien Grall wrote:
> > > Hi Roger,
> > > 
> > > On 23/06/2020 15:50, Roger Pau Monne wrote:
> > > > diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
> > > > index 9b62087be1..f360166f00 100644
> > > > --- a/xen/include/xen/mm.h
> > > > +++ b/xen/include/xen/mm.h
> > > > @@ -639,7 +639,8 @@ static inline void accumulate_tlbflush(bool *need_tlbflush,
> > > >        }
> > > >    }
> > > > -static inline void filtered_flush_tlb_mask(uint32_t tlbflush_timestamp)
> > > > +static inline void filtered_flush_tlb_mask(uint32_t tlbflush_timestamp,
> > > > +                                           bool sync)
> > > 
> > > I read the commit message and went through the code, but it is still not
> > > clear what "sync" means in a non-architectural way.
> > > 
> > > As an Arm developer, I would assume this means we don't wait for the TLB
> > > flush to complete. But I am sure this would result in some unexpected
> > > behavior.
> > 
> > No, when we return from filtered_flush_tlb_mask the flush has been
> > performed (either with sync or without), but I understand the
> > confusion given the parameter name.
> > 
> > > So can you explain on non-x86 words what this really mean?
> > 
> > sync (in this context) means to force the usage of an IPI (if built
> > with PV or shadow paging support) in order to perform the flush.
> 
> This is compared to what?

Using assisted flushes, like you do on Arm, where you don't send an
IPI in order to achieve a TLB flush on a remote pCPU.
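
To make that distinction concrete, the helper from the hunk above could
end up looking roughly like the below. This is only illustrative: the
body is paraphrased from the current filtered_flush_tlb_mask(), and
flush_tlb_mask_sync() is a made-up placeholder for whatever interface
ends up forcing the IPI-based flush:

static inline void filtered_flush_tlb_mask(uint32_t tlbflush_timestamp,
                                           bool sync)
{
    cpumask_t mask;

    cpumask_copy(&mask, &cpu_online_map);
    /* Drop pCPUs already flushed after the given timestamp. */
    tlbflush_filter(&mask, tlbflush_timestamp);
    if ( !cpumask_empty(&mask) )
    {
        perfc_incr(need_flush_tlb_flush);
        if ( sync )
            /* Force an IPI-based flush on the remote pCPUs. */
            flush_tlb_mask_sync(&mask);
        else
            /* May use the hypervisor-assisted flush when available. */
            flush_tlb_mask(&mask);
    }
}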

> > AFAICT on Arm you always avoid an IPI when performing a flush, and
> > that's fine because you don't have PV or shadow, so you don't
> > require this.
> 
> Arm provides instructions to broadcast a TLB flush, so by the time one of
> those instructions has completed, none of the TLB entries associated with
> the translation exist anymore.
> 
> I don't think using PV or shadow would change anything here in the way we do
> the flush.

TBH, I'm not sure how this applies to Arm. There's no PV or shadow
implementation, so I have no idea whether this would apply or not.

> > Also I'm not sure Arm has the concept of a spurious
> > page fault.
> 
> So if I understand correctly, the HW may raise a fault even if the mapping
> is present in the page-tables. Is that correct?

Yes, this can happen when you promote the permissions of a page-table
entry without doing a TLB flush AFAICT. Ie: you have a read-only page
that is promoted to writable, but you don't perform a TLB flush and
instead rely on getting a page fault that will clear the stale TLB entry
and retry.
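
Roughly, the pattern I'm describing is the following (toy model, not Xen
code, all the names and the PTE layout here are made up):

#include <stdbool.h>
#include <stdint.h>

#define PTE_PRESENT   (1u << 0)
#define PTE_WRITABLE  (1u << 1)

typedef uint64_t pte_t;

/* Arch-provided local TLB invalidation, placeholder declaration. */
extern void flush_tlb_one_local(unsigned long va);

/* Promotion: grant write access, deliberately without any TLB flush. */
static void make_writable(volatile pte_t *pte)
{
    *pte |= PTE_WRITABLE;
}

/*
 * Write fault handler: if the page tables already allow the write, the
 * fault came from a stale read-only TLB entry.  Drop the local entry
 * and retry instead of treating it as a real fault.
 */
static bool spurious_write_fault(volatile pte_t *pte, unsigned long va)
{
    if ( (*pte & (PTE_PRESENT | PTE_WRITABLE)) ==
         (PTE_PRESENT | PTE_WRITABLE) )
    {
        flush_tlb_one_local(va);
        return true;   /* spurious: retry the faulting access */
    }

    return false;      /* genuine permission fault */
}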

> We have a spurious page fault handler for stage-2 (aka EPT on x86) as we
> need to transition through an invalid mapping for certain page-table
> updates (e.g. superpage shattering). We use the same rwlock for the page
> fault handler and the page-table update, so there is no way the two can run
> concurrently.

This is slightly different, as it's used by PV page tables, so the
fault is triggered much more often than with the fault handler you are
referring to, IMO.

> > 
> > I could rename to force_ipi (or require_ipi) if that's better?
> 
> Hmmm, based on what I wrote above, I don't think this name would be more
> suitable. However, regardless of the name, it is not clear to me when a
> caller should use false or true.
> 
> Have you considered a rwlock to synchronize the two?

Yes, and the performance drop was huge when I tried. I could try to
refine it, but I think there's always going to be a performance drop, as
you then require mutual exclusion when modifying the page tables (you
take the lock in write mode). Right now modifications of the page tables
can be done concurrently.
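
For reference, my understanding of the rwlock scheme would be something
like the below (hypothetical sketch, the lock and helper names are
placeholders, and which side takes the lock shared vs exclusive is just
how I read your suggestion):

static DEFINE_RWLOCK(pt_lock);

/* PV page-table updates would become mutually exclusive... */
void update_pte(pte_t *pte, pte_t new)
{
    write_lock(&pt_lock);
    write_pte(pte, new);          /* placeholder for the actual update */
    write_unlock(&pt_lock);
}

/* ...while the flush / spurious fault side holds the lock shared. */
void check_spurious_fault(const pte_t *pte)
{
    read_lock(&pt_lock);
    /* Inspect the (now stable) page tables and flush as needed. */
    read_unlock(&pt_lock);
}

The write_lock() in the update path is where the concurrency we
currently have would be lost.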

FWIW, Xen explicitly moved from using a lock to this model in commit
3203345bb13, apparently due to a deadlock situation. I'm not sure if
that still holds.

Roger.