[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v4 16/21] VT-d: free all-empty page tables
> From: Jan Beulich <jbeulich@xxxxxxxx> > Sent: Monday, April 25, 2022 4:43 PM > > When a page table ends up with no present entries left, it can be > replaced by a non-present entry at the next higher level. The page table > itself can then be scheduled for freeing. > > Note that while its output isn't used there yet, > pt_update_contig_markers() right away needs to be called in all places > where entries get updated, not just the one where entries get cleared. > > Note further that while pt_update_contig_markers() updates perhaps > several PTEs within the table, since these are changes to "avail" bits > only I do not think that cache flushing would be needed afterwards. Such > cache flushing (of entire pages, unless adding yet more logic to me more > selective) would be quite noticable performance-wise (very prominent > during Dom0 boot). > > Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx> Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx> > --- > v4: Re-base over changes earlier in the series. > v3: Properly bound loop. Re-base over changes earlier in the series. > v2: New. > --- > The hang during boot on my Latitude E6410 (see the respective code > comment) was pretty close after iommu_enable_translation(). No errors, > no watchdog would kick in, just sometimes the first few pixel lines of > the next log message's (XEN) prefix would have made it out to the screen > (and there's no serial there). It's been a lot of experimenting until I > figured the workaround (which I consider ugly, but halfway acceptable). > I've been trying hard to make sure the workaround wouldn't be masking a > real issue, yet I'm still wary of it possibly doing so ... My best guess > at this point is that on these old IOMMUs the ignored bits 52...61 > aren't really ignored for present entries, but also aren't "reserved" > enough to trigger faults. This guess is from having tried to set other > bits in this range (unconditionally, and with the workaround here in > place), which yielded the same behavior. > > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -43,6 +43,9 @@ > #include "vtd.h" > #include "../ats.h" > > +#define CONTIG_MASK DMA_PTE_CONTIG_MASK > +#include <asm/pt-contig-markers.h> > + > /* dom_io is used as a sentinel for quarantined devices */ > #define QUARANTINE_SKIP(d, pgd_maddr) ((d) == dom_io && !(pgd_maddr)) > #define DEVICE_DOMID(d, pdev) ((d) != dom_io ? (d)->domain_id \ > @@ -405,6 +408,9 @@ static uint64_t addr_to_dma_page_maddr(s > > write_atomic(&pte->val, new_pte.val); > iommu_sync_cache(pte, sizeof(struct dma_pte)); > + pt_update_contig_markers(&parent->val, > + address_level_offset(addr, level), > + level, PTE_kind_table); > } > > if ( --level == target ) > @@ -837,9 +843,31 @@ static int dma_pte_clear_one(struct doma > > old = *pte; > dma_clear_pte(*pte); > + iommu_sync_cache(pte, sizeof(*pte)); > + > + while ( pt_update_contig_markers(&page->val, > + address_level_offset(addr, level), > + level, PTE_kind_null) && > + ++level < min_pt_levels ) > + { > + struct page_info *pg = maddr_to_page(pg_maddr); > + > + unmap_vtd_domain_page(page); > + > + pg_maddr = addr_to_dma_page_maddr(domain, addr, level, > flush_flags, > + false); > + BUG_ON(pg_maddr < PAGE_SIZE); > + > + page = map_vtd_domain_page(pg_maddr); > + pte = &page[address_level_offset(addr, level)]; > + dma_clear_pte(*pte); > + iommu_sync_cache(pte, sizeof(*pte)); > + > + *flush_flags |= IOMMU_FLUSHF_all; > + iommu_queue_free_pgtable(hd, pg); > + } > > spin_unlock(&hd->arch.mapping_lock); > - iommu_sync_cache(pte, sizeof(struct dma_pte)); > > unmap_vtd_domain_page(page); > > @@ -2182,8 +2210,21 @@ static int __must_check cf_check intel_i > } > > *pte = new; > - > iommu_sync_cache(pte, sizeof(struct dma_pte)); > + > + /* > + * While the (ab)use of PTE_kind_table here allows to save some work in > + * the function, the main motivation for it is that it avoids a so far > + * unexplained hang during boot (while preparing Dom0) on a Westmere > + * based laptop. > + */ > + pt_update_contig_markers(&page->val, > + address_level_offset(dfn_to_daddr(dfn), level), > + level, > + (hd->platform_ops->page_sizes & > + (1UL << level_to_offset_bits(level + 1)) > + ? PTE_kind_leaf : PTE_kind_table)); > + > spin_unlock(&hd->arch.mapping_lock); > unmap_vtd_domain_page(page); >
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |