[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 02/18] VT-d: have callers specify the target level for page table walks
On Fri, Sep 24, 2021 at 11:42:13AM +0200, Jan Beulich wrote: > In order to be able to insert/remove super-pages we need to allow > callers of the walking function to specify at which point to stop the > walk. > > For intel_iommu_lookup_page() integrate the last level access into > the main walking function. > > dma_pte_clear_one() gets only partly adjusted for now: Error handling > and order parameter get put in place, but the order parameter remains > ignored (just like intel_iommu_map_page()'s order part of the flags). > > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> > --- > I have to admit that I don't understand why domain_pgd_maddr() wants to > populate all page table levels for DFN 0. I think it would be enough to create up to the level requested by the caller? Seems like a lazy way to always assert that the level requested by the caller would be present. > > I was actually wondering whether it wouldn't make sense to integrate > dma_pte_clear_one() into its only caller intel_iommu_unmap_page(), for > better symmetry with intel_iommu_map_page(). > --- > v2: Fix build. > > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -264,63 +264,116 @@ static u64 bus_to_context_maddr(struct v > return maddr; > } > > -static u64 addr_to_dma_page_maddr(struct domain *domain, u64 addr, int alloc) > +/* > + * This function walks (and if requested allocates) page tables to the > + * designated target level. It returns > + * - 0 when a non-present entry was encountered and no allocation was > + * requested, > + * - a small positive value (the level, i.e. below PAGE_SIZE) upon allocation > + * failure, > + * - for target > 0 the address of the page table holding the leaf PTE for ^ physical I think it's clearer, as the return type could be ambiguous. > + * the requested address, > + * - for target == 0 the full PTE. Could this create confusion if for example one PTE maps physical page 0? As the caller when getting back a full PTE with address 0 and some of the low bits set could interpret the result as an error. I think we already had this discussion on other occasions, but I would rather add a parameter to be used as a return placeholder (ie: a *dma_pte maybe?) and use the function return value just for errors because IMO it's clearer, but I know you don't usually like this approach, so I'm not going to insist. > + */ > +static uint64_t addr_to_dma_page_maddr(struct domain *domain, daddr_t addr, > + unsigned int target, > + unsigned int *flush_flags, bool alloc) > { > struct domain_iommu *hd = dom_iommu(domain); > int addr_width = agaw_to_width(hd->arch.vtd.agaw); > struct dma_pte *parent, *pte = NULL; > - int level = agaw_to_level(hd->arch.vtd.agaw); > - int offset; > + unsigned int level = agaw_to_level(hd->arch.vtd.agaw), offset; > u64 pte_maddr = 0; > > addr &= (((u64)1) << addr_width) - 1; > ASSERT(spin_is_locked(&hd->arch.mapping_lock)); > + ASSERT(target || !alloc); Might be better to use an if with ASSERT_UNREACHABLE and return an error? (ie: level itself?) > + > if ( !hd->arch.vtd.pgd_maddr ) > { > struct page_info *pg; > > - if ( !alloc || !(pg = iommu_alloc_pgtable(domain)) ) > + if ( !alloc ) > + goto out; > + > + pte_maddr = level; > + if ( !(pg = iommu_alloc_pgtable(domain)) ) > goto out; > > hd->arch.vtd.pgd_maddr = page_to_maddr(pg); > } > > - parent = (struct dma_pte *)map_vtd_domain_page(hd->arch.vtd.pgd_maddr); > - while ( level > 1 ) > + pte_maddr = hd->arch.vtd.pgd_maddr; > + parent = map_vtd_domain_page(pte_maddr); > + while ( level > target ) > { > offset = address_level_offset(addr, level); > pte = &parent[offset]; > > pte_maddr = dma_pte_addr(*pte); > - if ( !pte_maddr ) > + if ( !dma_pte_present(*pte) || (level > 1 && > dma_pte_superpage(*pte)) ) > { > struct page_info *pg; > + /* > + * Higher level tables always set r/w, last level page table > + * controls read/write. > + */ > + struct dma_pte new_pte = { DMA_PTE_PROT }; > > if ( !alloc ) > - break; > + { > + pte_maddr = 0; > + if ( !dma_pte_present(*pte) ) > + break; > + > + /* > + * When the leaf entry was requested, pass back the full PTE, > + * with the address adjusted to account for the residual of > + * the walk. > + */ > + pte_maddr = pte->val + Wouldn't it be better to use dma_pte_addr(*pte) rather than accessing pte->val, and then you could drop the PAGE_MASK? Or is the addr parameter not guaranteed to be page aligned? > + (addr & ((1UL << level_to_offset_bits(level)) - 1) & > + PAGE_MASK); > + if ( !target ) > + break; I'm confused by the conditional break here, why would you calculate pte_maddr unconditionally to get overwritten just the line below if target != 0? Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |