
Re: [Xen-devel] [PATCH 2/2] iommu: elide flushing for higher order map/unmap operations



> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf
> Of Jan Beulich
> Sent: 03 December 2018 15:03
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: Kevin Tian <kevin.tian@xxxxxxxxx>; Stefano Stabellini
> <sstabellini@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>; Suravee
> Suthikulpanit <suravee.suthikulpanit@xxxxxxx>; xen-devel <xen-
> devel@xxxxxxxxxxxxxxxxxxxx>; Brian Woods <brian.woods@xxxxxxx>; Roger Pau
> Monne <roger.pau@xxxxxxxxxx>
> Subject: Re: [Xen-devel] [PATCH 2/2] iommu: elide flushing for higher
> order map/unmap operations
> 
> >>> On 30.11.18 at 11:45, <paul.durrant@xxxxxxxxxx> wrote:
> > This patch removes any implicit flushing that occurs in the implementation
> > of map and unmap operations and, instead, adds explicit flushing at the
> > end of the loops in the iommu_map/unmap() wrapper functions.
> >
> > Because VT-d currently performs two different types of flush dependent
> > upon whether a PTE is being modified versus merely added (i.e. replacing
> > a non-present PTE), an 'iommu_flush_type' enumeration is defined by this
> > patch and the iommu_ops map method is modified to pass back the type of
> > flush necessary for the PTE that has been populated. When a higher order
> > mapping operation is done, the wrapper code performs the 'highest' level
> > of flush required by the individual iommu_ops method calls, where a
> > 'modified PTE' flush is deemed to be higher than an 'added PTE' flush.
> 
> I'm afraid such ordering properties may not generally exist. That is,
> what you pass to the flush handlers needs to be an OR of "added new
> entries" and "modified existing entries". That's because, at least in
> the abstract case, it may be that distinct flushes need to be issued
> for both cases (i.e. potentially two of them).

Ok, I can set things up that way instead.

> 
> > -static bool set_iommu_pte_present(unsigned long pt_mfn, unsigned long dfn,
> > -                                  unsigned long next_mfn, int pde_level,
> > -                                  bool iw, bool ir)
> > +static enum iommu_flush_type set_iommu_pte_present(
> > +    unsigned long pt_mfn, unsigned long dfn, unsigned long next_mfn,
> > +    int pde_level, bool iw, bool ir)
> >  {
> >      uint64_t *table;
> >      uint32_t *pde;
> > -    bool need_flush;
> > +    enum iommu_flush_type flush_type;
> >
> >      table = map_domain_page(_mfn(pt_mfn));
> >
> >      pde = (uint32_t *)(table + pfn_to_pde_idx(dfn, pde_level));
> >
> > -    need_flush = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> > +    flush_type = set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> >      unmap_domain_page(table);
> > -    return need_flush;
> > +    return flush_type;
> >  }
> 
> Please take the opportunity and add the missing blank line.
> 

Ok.

> > @@ -629,8 +629,6 @@ int amd_iommu_unmap_page(struct domain *d, dfn_t dfn)
> >      clear_iommu_pte_present(pt_mfn[1], dfn_x(dfn));
> >
> >      spin_unlock(&hd->arch.mapping_lock);
> > -
> > -    amd_iommu_flush_pages(d, dfn_x(dfn), 0);
> >      return 0;
> >  }
> 
> Please retain the blank line.

Ok.

> 
> > @@ -700,12 +705,23 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      for ( i = 0; i < npages; i++ )
> >      {
> >          unsigned long frame = gfn + i;
> > +        enum iommu_flush_type this_flush_type;
> >
> > -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
> > +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
> > +                                &this_flush_type);
> >          if ( rt != 0 )
> > -            return rt;
> > +            break;
> > +
> > +        flush_type = MAX(flush_type, this_flush_type);
> >      }
> > -    return 0;
> > +
> > +    /*
> > +     * The underlying implementation is void so the return value is
> > +     * meaningless and can hence be ignored.
> > +     */
> > +    ignored = amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> > +                                          flush_type);
> 
> I'm afraid such an assignment without subsequent use can be
> (legitimately) warned about by compilers. Hence the approach
> I had asked you to restore in one of your earlier patches. The
> exact same one won't fit here, but while ( ... ) break; should.
> Same then elsewhere.
> 

Ok.

> > @@ -319,11 +324,18 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> >
> >      for ( i = 0; i < (1ul << page_order); i++ )
> >      {
> > +        enum iommu_flush_type this_flush_type;
> > +        int ignore;
> > +
> >          rc = hd->platform_ops->map_page(d, dfn_add(dfn, i),
> > -                                        mfn_add(mfn, i), flags);
> > +                                        mfn_add(mfn, i), flags,
> > +                                        &this_flush_type);
> >
> >          if ( likely(!rc) )
> > +        {
> > +            flush_type = MAX(flush_type, this_flush_type);
> 
> With the comment above this is unlikely to stay anyway, but if it
> does please use max() instead. At least I can't see why you
> couldn't use the typesafe variant here.
> 

As you say, it will go away.

> > @@ -336,12 +348,19 @@ int iommu_map(struct domain *d, dfn_t dfn, mfn_t mfn,
> >              if ( hd->platform_ops->unmap_page(d, dfn_add(dfn, i)) )
> >                  continue;
> >
> > +        /* Something went wrong so attempt a full flush */
> > +        ignore = hd->platform_ops->iotlb_flush_all(d);
> > +
> >          if ( !is_hardware_domain(d) )
> >              domain_crash(d);
> >
> >          break;
> >      }
> >
> > +    if ( hd->platform_ops->iotlb_flush && !this_cpu(iommu_dont_flush_iotlb) )
> > +        rc = hd->platform_ops->iotlb_flush(d, dfn, (1ul << page_order),
> 
> 1u only please, since the function parameter is unsigned int.
> 

Ok.

> > @@ -378,6 +397,10 @@ int iommu_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> >          }
> >      }
> >
> > +    if ( hd->platform_ops->iotlb_flush )
> > +        rc = hd->platform_ops->iotlb_flush(d, dfn, (1ul << page_order),
> 
> Same here.
> 

Ok.

> > @@ -417,7 +440,9 @@ int iommu_iotlb_flush(struct domain *d, dfn_t dfn, unsigned int page_count)
> >      if ( !iommu_enabled || !hd->platform_ops || !hd->platform_ops->iotlb_flush )
> >          return 0;
> >
> > -    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count);
> > +    /* Assume a 'modified' flush is required */
> > +    rc = hd->platform_ops->iotlb_flush(d, dfn, page_count,
> > +                                       IOMMU_FLUSH_modified);
> 
> As per above this would then become the OR of both flush modes.

Yes.

> 
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -631,11 +631,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
> >      return rc;
> >  }
> >
> > -static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> > -                                                dfn_t dfn,
> > -                                                unsigned int page_count)
> > +static int __must_check iommu_flush_iotlb_pages(
> > +    struct domain *d, dfn_t dfn, unsigned int page_count,
> > +    enum iommu_flush_type flush_type)
> 
> Is the re-flowing really needed?
> 

Yes. The enum is long and won't fit within 80 chars otherwise.

> >  {
> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> > +    return (flush_type == IOMMU_FLUSH_none) ?
> > +           0 :
> > +           iommu_flush_iotlb(d, dfn, (flush_type == IOMMU_FLUSH_modified),
> 
> Unnecessary parentheses.

Ok.

> 
> > @@ -674,9 +677,6 @@ static int __must_check dma_pte_clear_one(struct domain *domain, u64 addr)
> >      spin_unlock(&hd->arch.mapping_lock);
> >      iommu_flush_cache_entry(pte, sizeof(struct dma_pte));
> >
> > -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        rc = iommu_flush_iotlb_pages(domain, daddr_to_dfn(addr), 1);
> 
> This code not getting replaced by another addition right in this
> source file, and this function's only caller being
> intel_iommu_unmap_page() makes me wonder why you don't
> have the unmap functions similarly hand back a flush indicator.
> 

Well, the assumption is that unmap is always modifying an existing entry. Is 
that assumption wrong?

  Paul

> Jan
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel

 

