
Re: [Xen-devel] [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations



> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf
> Of Jan Beulich
> Sent: 04 December 2018 16:02
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Konrad Rzeszutek
> Wilk <konrad.wilk@xxxxxxxxxx>; Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>;
> Tim (Xen.org) <tim@xxxxxxx>; George Dunlap <George.Dunlap@xxxxxxxxxx>;
> Julien Grall <julien.grall@xxxxxxx>; Suravee Suthikulpanit
> <suravee.suthikulpanit@xxxxxxx>; xen-devel <xen-
> devel@xxxxxxxxxxxxxxxxxxxx>; Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Brian
> Woods <brian.woods@xxxxxxx>; Roger Pau Monne <roger.pau@xxxxxxxxxx>
> Subject: Re: [Xen-devel] [PATCH v2 3/4] iommu: elide flushing for higher
> order map/unmap operations
> 
> >>> On 04.12.18 at 16:36, <Paul.Durrant@xxxxxxxxxx> wrote:
> >> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> >> Sent: 04 December 2018 15:17
> >>
> >> >>> On 03.12.18 at 18:40, <paul.durrant@xxxxxxxxxx> wrote:
> >> > --- a/xen/arch/arm/p2m.c
> >> > +++ b/xen/arch/arm/p2m.c
> >> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain
> *p2m,
> >> >
> >> >      if ( need_iommu_pt_sync(p2m->domain) &&
> >> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> >> > +    {
> >> > +        unsigned int flush_flags = 0;
> >> > +
> >> > +        if ( lpae_is_valid(orig_pte) )
> >> > +            flush_flags |= IOMMU_FLUSHF_modified;
> >> > +        if ( lpae_is_valid(*entry) )
> >> > +            flush_flags |= IOMMU_FLUSHF_added;
> >>
> >> Shouldn't this be "else if" with the meaning assigned to both
> >> types? From an abstract perspective I'd also expect that for
> >> a single mapping no more than one of the flags can come
> >> back set (through the iommu_ops interface).
> >
> > That's not how I see it. My rationale is:
> >
> > - present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
> > - new PTE value is present -> IOMMU_FLUSHF_added
> >
> > So, a single op can set any combination of bits and thus the above code
> > does not use 'else if'.
> 
> I can't fit this with the code comments:
> 
> enum
> {
>     _IOMMU_FLUSHF_added, /* no modified entries, just additional entries
> */
>     _IOMMU_FLUSHF_modified, /* modified entries */
> };
> 
> ..., in particular the "no modified entries" part.

That was supposed to emphasize that the other flag is needed in the case of a 
mapping that modifies an existing entry... I'll re-state using a block 
comment.
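For illustration, the re-stated comment and the flag computation from the 
__p2m_set_entry() hunk quoted above might look like this (a sketch only; the 
comment wording is hypothetical, not the final patch text):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch of the intended block comment (hypothetical wording derived
 * from the rationale above):
 *
 * IOMMU_FLUSHF_added:    a previously non-present entry was made present.
 * IOMMU_FLUSHF_modified: a previously present entry was changed or
 *                        removed.
 *
 * A single map operation that replaces an existing mapping therefore
 * sets both flags, which is why the code below does not use 'else if'.
 */
enum
{
    _IOMMU_FLUSHF_added,
    _IOMMU_FLUSHF_modified,
};

#define IOMMU_FLUSHF_added    (1u << _IOMMU_FLUSHF_added)
#define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)

/* Flag computation as in the __p2m_set_entry() hunk quoted above. */
static unsigned int compute_flush_flags(bool old_valid, bool new_valid)
{
    unsigned int flush_flags = 0;

    if ( old_valid )
        flush_flags |= IOMMU_FLUSHF_modified;
    if ( new_valid )
        flush_flags |= IOMMU_FLUSHF_added;

    return flush_flags;
}
```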

> 
> >> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde,
> >> unsigned long next_mfn,
> >> >
> >> >          if ( maddr_old != maddr_next || iw != old_w || ir != old_r
> ||
> >> >               old_level != next_level )
> >> > -            need_flush = true;
> >> > +            flush_flags = IOMMU_FLUSHF_modified;
> >>
> >> Why uniformly "modified"?
> >
> > Because the AMD IOMMU does require flushing for a non-present -> present
> > transition AFAICT. The old code certainly implies this.
> 
> It is one thing what the flush function does with the value, but
> another whether the modifying function "lies". I'm not opposed
> to simplification, but then a comment needs to explain this.
> 

Ok. Maybe it is simpler to request the 'added' flush here as well, and then 
just ignore it in the flush method.
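That suggestion could be sketched as follows (hypothetical helper names, not 
the actual functions in xen/drivers/passthrough/amd/): the map side reports 
honestly what happened to the PDE, and the flush side flushes for either flag 
since the hardware requires a flush for non-present -> present transitions 
too:

```c
#include <assert.h>
#include <stdbool.h>

#define IOMMU_FLUSHF_added    (1u << 0)
#define IOMMU_FLUSHF_modified (1u << 1)

/* Map-side sketch: report what actually happened to the PDE rather
 * than uniformly claiming "modified". */
static unsigned int pde_update_flags(bool old_present, bool changed)
{
    unsigned int flush_flags = 0;

    if ( !old_present )
        flush_flags |= IOMMU_FLUSHF_added;
    else if ( changed )
        flush_flags |= IOMMU_FLUSHF_modified;

    return flush_flags;
}

/* Flush-side sketch: since this IOMMU needs flushing even for a
 * non-present -> present transition, the distinction between the two
 * flags is ignored and any set flag implies a flush. */
static bool needs_flush(unsigned int flush_flags)
{
    return flush_flags != 0;
}
```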

> >> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain
> *d)
> >> >                  process_pending_softirqs();
> >> >          }
> >> >
> >> > +        while ( !flush_flags && iommu_flush_all(d) )
> >> > +            break;
> >>
> >> Is there a comment missing here explaining the seemingly odd
> >> loop?
> >
> > I'm merely using the construct you suggested, but I can add a comment.
> 
> And I'm fine with the construct, but in the other place (for which
> we did discuss this for the earlier version) there is a comment.
>

Ok. I'll add a similar comment.

> >> > --- a/xen/drivers/passthrough/vtd/iommu.c
> >> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> >> > @@ -633,11 +633,14 @@ static int __must_check
> iommu_flush_iotlb(struct
> >> domain *d, dfn_t dfn,
> >> >
> >> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> >> >                                                  dfn_t dfn,
> >> > -                                                unsigned int
> >> page_count)
> >> > +                                                unsigned int
> >> page_count,
> >> > +                                                unsigned int
> >> flush_flags)
> >> >  {
> >> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> >> > +    ASSERT(flush_flags);
> >> >
> >> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> >> > +    return iommu_flush_iotlb(d, dfn, flush_flags &
> >> IOMMU_FLUSHF_modified,
> >> > +                             page_count);
> >>
> >> Why the restriction to "modified"?
> >
> > The parameter is a bool which should be true if an existing PTE was
> > modified or false otherwise. I can make this
> > !!(flush_flags & IOMMU_FLUSHF_modified) if you prefer.
> 
> No, that wasn't my point. The question is why this isn't just
> "flush_flags", without any masking. Iirc there are precautions
> in the VT-d code to deal with hardware which may cache
> non-present entries. In that case "added" requires flushing too.
> 

I don't understand. iommu_flush_iotlb()'s third argument is 
'dma_old_pte_present' so that should be true iff IOMMU_FLUSHF_modified is set. 
IOMMU_FLUSHF_added is irrelevant to the implementation.
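To spell out the conversion under discussion (a sketch with a stand-in for 
iommu_flush_iotlb(); only the flag-to-bool mapping is the point here):

```c
#include <assert.h>
#include <stdbool.h>

#define IOMMU_FLUSHF_added    (1u << 0)
#define IOMMU_FLUSHF_modified (1u << 1)

/* Hypothetical stand-in for iommu_flush_iotlb(): just echoes whether
 * the caller claimed an old (present) PTE was modified. */
static bool flush_iotlb_stub(bool dma_old_pte_present)
{
    return dma_old_pte_present;
}

/* dma_old_pte_present should be true iff IOMMU_FLUSHF_modified is set;
 * IOMMU_FLUSHF_added alone does not imply an old entry existed. */
static bool flush_pages(unsigned int flush_flags)
{
    return flush_iotlb_stub(!!(flush_flags & IOMMU_FLUSHF_modified));
}
```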

  Paul

> Jan
> 
> 
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel