Re: [Xen-devel] [PATCH v2 3/4] iommu: elide flushing for higher order map/unmap operations
> -----Original Message-----
> From: Jan Beulich [mailto:JBeulich@xxxxxxxx]
> Sent: 04 December 2018 15:17
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: Brian Woods <brian.woods@xxxxxxx>; Suravee Suthikulpanit
> <suravee.suthikulpanit@xxxxxxx>; Julien Grall <julien.grall@xxxxxxx>;
> Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; Roger Pau Monne
> <roger.pau@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>; Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Kevin
> Tian <kevin.tian@xxxxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>;
> xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>; Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx>; Tim (Xen.org) <tim@xxxxxxx>
> Subject: Re: [PATCH v2 3/4] iommu: elide flushing for higher order
> map/unmap operations
>
> >>> On 03.12.18 at 18:40, <paul.durrant@xxxxxxxxxx> wrote:
> > This patch removes any implicit flushing that occurs in the
> > implementation of map and unmap operations and adds new
> > iommu_map/unmap() wrapper functions. To maintain the semantics of the
> > iommu_legacy_map/unmap() wrapper functions, these are modified to call
> > the new wrapper functions and then perform an explicit flush operation.
> >
> > Because VT-d currently performs two different types of flush depending
> > upon whether a PTE is being modified versus merely added (i.e. replacing
> > a non-present PTE), 'iommu flush flags' are defined by this patch and
> > the iommu_ops map_page() and unmap_page() methods are modified to OR
> > the type of flush necessary for the PTE that has been populated or
> > depopulated into an accumulated flags value. The accumulated value can
> > then be passed into the explicit flush operation.
> >
> > The ARM SMMU implementations of map_page() and unmap_page() currently
> > perform no implicit flushing and therefore the modified methods do not
> > adjust the flush flags.
>
> Which, however, likely is wrong. If we mean the flushing to be initiated
> by the arch- and vendor-independent wrappers, then all map/unmap
> backends should indicate the needed kind of flush. Granted this can be
> done later, if things are otherwise correct on Arm right now.
>
> > --- a/xen/arch/arm/p2m.c
> > +++ b/xen/arch/arm/p2m.c
> > @@ -971,8 +971,17 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
> >
> >      if ( need_iommu_pt_sync(p2m->domain) &&
> >           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
> > +    {
> > +        unsigned int flush_flags = 0;
> > +
> > +        if ( lpae_is_valid(orig_pte) )
> > +            flush_flags |= IOMMU_FLUSHF_modified;
> > +        if ( lpae_is_valid(*entry) )
> > +            flush_flags |= IOMMU_FLUSHF_added;
>
> Shouldn't this be "else if" with the meaning assigned to both
> types? From an abstract perspective I'd also expect that for
> a single mapping no more than one of the flags can come
> back set (through the iommu_ops interface).

That's not how I see it. My rationale is:

- present PTE made non-present, or modified -> IOMMU_FLUSHF_modified
- new PTE value is present -> IOMMU_FLUSHF_added

So, a single op can set any combination of bits and thus the above code
does not use 'else if'.

>
> > @@ -84,7 +86,7 @@ static bool set_iommu_pde_present(uint32_t *pde, unsigned long next_mfn,
> >
> >      if ( maddr_old != maddr_next || iw != old_w || ir != old_r ||
> >           old_level != next_level )
> > -        need_flush = true;
> > +        flush_flags = IOMMU_FLUSHF_modified;
>
> Why uniformly "modified"?

Because the AMD IOMMU does require flushing for a non-present -> present
transition AFAICT. The old code certainly implies this.
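To restate that rationale as code: a map/unmap backend would accumulate
flags along these lines (a sketch only; the helper is hypothetical and
not part of the patch):

    /* Sketch: per-PTE accumulation of the flush flags described above. */
    static unsigned int pte_flush_flags(bool old_present, bool new_present)
    {
        unsigned int flush_flags = 0;

        /* A present PTE was modified or made non-present. */
        if ( old_present )
            flush_flags |= IOMMU_FLUSHF_modified;

        /* The new PTE value is present. */
        if ( new_present )
            flush_flags |= IOMMU_FLUSHF_added;

        return flush_flags;
    }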
>
> > @@ -645,11 +648,13 @@ static unsigned long flush_count(unsigned long dfn, unsigned int page_count,
> >  }
> >
> >  int amd_iommu_flush_iotlb_pages(struct domain *d, dfn_t dfn,
> > -                                unsigned int page_count)
> > +                                unsigned int page_count,
> > +                                unsigned int flush_flags)
> >  {
> >      unsigned long dfn_l = dfn_x(dfn);
> >
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags & IOMMU_FLUSHF_modified);
>
> Is this valid? What if a map operation solely re-established what
> was already there? Aiui in that case set_iommu_pde_present()
> would always return zero. Or take this (seeing that the generic
> wrapper has a zero check for the flush flags):

Yes, the ASSERT is there because this should never be called unless
flush_flags != 0 (ensured by the wrapper), and the map code should only
ever set IOMMU_FLUSHF_modified.

>
> > @@ -692,6 +697,7 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      unsigned long npages, i;
> >      unsigned long gfn;
> >      unsigned int flags = !!ir;
> > +    unsigned int flush_flags = 0;
> >      int rt = 0;
> >
> >      if ( iw )
> > @@ -703,11 +709,21 @@ int amd_iommu_reserve_domain_unity_map(struct domain *domain,
> >      {
> >          unsigned long frame = gfn + i;
> >
> > -        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags);
> > +        rt = amd_iommu_map_page(domain, _dfn(frame), _mfn(frame), flags,
> > +                                &flush_flags);
> >          if ( rt != 0 )
> > -            return rt;
> > +            break;
> >      }
> > -    return 0;
> > +
> > +    /*
> > +     * The underlying implementation is void so the return value is
> > +     * meaningless and can hence be ignored.
> > +     */
> > +    while ( amd_iommu_flush_iotlb_pages(domain, _dfn(gfn), npages,
> > +                                        flush_flags) )
> > +        break;
>
> Nothing here guarantees flush_flags to be non-zero.

Good point. I'll add a check.

>
> > @@ -235,6 +236,9 @@ void __hwdom_init iommu_hwdom_init(struct domain *d)
> >              process_pending_softirqs();
> >      }
> >
> > +    while ( !flush_flags && iommu_flush_all(d) )
> > +        break;
>
> Is there a comment missing here explaining the seemingly odd
> loop?

I'm merely using the construct you suggested, but I can add a comment.

>
> > @@ -381,6 +402,17 @@ int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> >      return rc;
> >  }
> >
> > +int iommu_legacy_unmap(struct domain *d, dfn_t dfn, unsigned int page_order)
> > +{
> > +    unsigned int flush_flags = 0;
> > +    int rc = iommu_unmap(d, dfn, page_order, &flush_flags);
> > +
> > +    if ( !rc )
> > +        rc = iommu_flush(d, dfn, (1u << page_order), flush_flags);
>
> No iommu_dont_flush_iotlb check needed here?

I thought the old VT-d unmap code ignored it, but I see it didn't, so yes
I do need to add the check.

>
> > --- a/xen/drivers/passthrough/vtd/iommu.c
> > +++ b/xen/drivers/passthrough/vtd/iommu.c
> > @@ -633,11 +633,14 @@ static int __must_check iommu_flush_iotlb(struct domain *d, dfn_t dfn,
> >
> >  static int __must_check iommu_flush_iotlb_pages(struct domain *d,
> >                                                  dfn_t dfn,
> > -                                                unsigned int page_count)
> > +                                                unsigned int page_count,
> > +                                                unsigned int flush_flags)
> >  {
> >      ASSERT(page_count && !dfn_eq(dfn, INVALID_DFN));
> > +    ASSERT(flush_flags);
> >
> > -    return iommu_flush_iotlb(d, dfn, 1, page_count);
> > +    return iommu_flush_iotlb(d, dfn, flush_flags & IOMMU_FLUSHF_modified,
> > +                             page_count);
>
> Why the restriction to "modified"?

The parameter is a bool which should be true if an existing PTE was
modified or false otherwise. I can make this
!!(flush_flags & IOMMU_FLUSHF_modified) if you prefer.
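That is, the call would then read (untested):

    return iommu_flush_iotlb(d, dfn,
                             !!(flush_flags & IOMMU_FLUSHF_modified),
                             page_count);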
>
> > @@ -1825,15 +1825,18 @@ static int __must_check intel_iommu_map_page(struct domain *d,
> >      spin_unlock(&hd->arch.mapping_lock);
> >      unmap_vtd_domain_page(page);
> >
> > -    if ( !this_cpu(iommu_dont_flush_iotlb) )
> > -        rc = iommu_flush_iotlb(d, dfn, dma_pte_present(old), 1);
> > +    *flush_flags |= IOMMU_FLUSHF_added;
> > +    if ( dma_pte_present(old) )
> > +        *flush_flags |= IOMMU_FLUSHF_modified;
>
> See my earlier comment as to only one of them to get set for an
> individual mapping.
>
> > @@ -62,14 +61,15 @@ int arch_iommu_populate_page_table(struct domain *d)
> >      {
> >          unsigned long mfn = mfn_x(page_to_mfn(page));
> >          unsigned long gfn = mfn_to_gmfn(d, mfn);
> > +        unsigned int flush_flags = 0;
> >
> >          if ( gfn != gfn_x(INVALID_GFN) )
> >          {
> >              ASSERT(!(gfn >> DEFAULT_DOMAIN_ADDRESS_WIDTH));
> >              BUG_ON(SHARED_M2P(gfn));
> > -            rc = iommu_legacy_map(d, _dfn(gfn), _mfn(mfn),
> > -                                  PAGE_ORDER_4K, IOMMUF_readable |
> > -                                  IOMMUF_writable);
> > +            rc = iommu_map(d, _dfn(gfn), _mfn(mfn),
> > +                           PAGE_ORDER_4K, IOMMUF_readable |
> > +                           IOMMUF_writable, &flush_flags);
> >          }
> >          if ( rc )
> >          {
> > @@ -103,7 +103,6 @@ int arch_iommu_populate_page_table(struct domain *d)
> >      }
> >
> >      spin_unlock(&d->page_alloc_lock);
> > -    this_cpu(iommu_dont_flush_iotlb) = 0;
> >
> >      if ( !rc )
> >          rc = iommu_flush_all(d);
>
> Would be nice to have a comment here clarifying why flush_flags
> doesn't get used.

Ok.

>
> > @@ -249,6 +251,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
> >          if (!(i & 0xfffff))
> >              process_pending_softirqs();
> >      }
> > +
> > +    if ( !flush_flags && iommu_flush_all(d) )
> > +        return;
> >  }
>
> Again please attach a brief comment explaining the seemingly
> strange construct.

Ok.

>
> > --- a/xen/include/xen/iommu.h
> > +++ b/xen/include/xen/iommu.h
> > @@ -93,6 +93,22 @@ void iommu_teardown(struct domain *d);
> >  #define _IOMMUF_writable 1
> >  #define IOMMUF_writable (1u<<_IOMMUF_writable)
> >
> > +enum
> > +{
>
> Brace on the same line as "enum" please, just like for struct/union.
> When they're named this helps finding the place where a certain type
> gets (fully) declared.

Ok.

  Paul

> Jan
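For illustration, the declaration reshaped as requested might read (the
enumerator names are an assumption, inferred from the IOMMU_FLUSHF_*
flags used throughout the patch):

    enum {
        _IOMMU_FLUSHF_added,
        _IOMMU_FLUSHF_modified
    };

    #define IOMMU_FLUSHF_added    (1u << _IOMMU_FLUSHF_added)
    #define IOMMU_FLUSHF_modified (1u << _IOMMU_FLUSHF_modified)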