|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] RE: [PATCH v4 16/21] VT-d: free all-empty page tables
> From: Jan Beulich <jbeulich@xxxxxxxx>
> Sent: Monday, April 25, 2022 4:43 PM
>
> When a page table ends up with no present entries left, it can be
> replaced by a non-present entry at the next higher level. The page table
> itself can then be scheduled for freeing.
>
> Note that while its output isn't used there yet,
> pt_update_contig_markers() right away needs to be called in all places
> where entries get updated, not just the one where entries get cleared.
>
> Note further that while pt_update_contig_markers() updates perhaps
> several PTEs within the table, since these are changes to "avail" bits
> only I do not think that cache flushing would be needed afterwards. Such
> cache flushing (of entire pages, unless adding yet more logic to me more
> selective) would be quite noticable performance-wise (very prominent
> during Dom0 boot).
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
> ---
> v4: Re-base over changes earlier in the series.
> v3: Properly bound loop. Re-base over changes earlier in the series.
> v2: New.
> ---
> The hang during boot on my Latitude E6410 (see the respective code
> comment) was pretty close after iommu_enable_translation(). No errors,
> no watchdog would kick in, just sometimes the first few pixel lines of
> the next log message's (XEN) prefix would have made it out to the screen
> (and there's no serial there). It's been a lot of experimenting until I
> figured the workaround (which I consider ugly, but halfway acceptable).
> I've been trying hard to make sure the workaround wouldn't be masking a
> real issue, yet I'm still wary of it possibly doing so ... My best guess
> at this point is that on these old IOMMUs the ignored bits 52...61
> aren't really ignored for present entries, but also aren't "reserved"
> enough to trigger faults. This guess is from having tried to set other
> bits in this range (unconditionally, and with the workaround here in
> place), which yielded the same behavior.
>
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -43,6 +43,9 @@
> #include "vtd.h"
> #include "../ats.h"
>
> +#define CONTIG_MASK DMA_PTE_CONTIG_MASK
> +#include <asm/pt-contig-markers.h>
> +
> /* dom_io is used as a sentinel for quarantined devices */
> #define QUARANTINE_SKIP(d, pgd_maddr) ((d) == dom_io && !(pgd_maddr))
> #define DEVICE_DOMID(d, pdev) ((d) != dom_io ? (d)->domain_id \
> @@ -405,6 +408,9 @@ static uint64_t addr_to_dma_page_maddr(s
>
> write_atomic(&pte->val, new_pte.val);
> iommu_sync_cache(pte, sizeof(struct dma_pte));
> + pt_update_contig_markers(&parent->val,
> + address_level_offset(addr, level),
> + level, PTE_kind_table);
> }
>
> if ( --level == target )
> @@ -837,9 +843,31 @@ static int dma_pte_clear_one(struct doma
>
> old = *pte;
> dma_clear_pte(*pte);
> + iommu_sync_cache(pte, sizeof(*pte));
> +
> + while ( pt_update_contig_markers(&page->val,
> + address_level_offset(addr, level),
> + level, PTE_kind_null) &&
> + ++level < min_pt_levels )
> + {
> + struct page_info *pg = maddr_to_page(pg_maddr);
> +
> + unmap_vtd_domain_page(page);
> +
> + pg_maddr = addr_to_dma_page_maddr(domain, addr, level,
> flush_flags,
> + false);
> + BUG_ON(pg_maddr < PAGE_SIZE);
> +
> + page = map_vtd_domain_page(pg_maddr);
> + pte = &page[address_level_offset(addr, level)];
> + dma_clear_pte(*pte);
> + iommu_sync_cache(pte, sizeof(*pte));
> +
> + *flush_flags |= IOMMU_FLUSHF_all;
> + iommu_queue_free_pgtable(hd, pg);
> + }
>
> spin_unlock(&hd->arch.mapping_lock);
> - iommu_sync_cache(pte, sizeof(struct dma_pte));
>
> unmap_vtd_domain_page(page);
>
> @@ -2182,8 +2210,21 @@ static int __must_check cf_check intel_i
> }
>
> *pte = new;
> -
> iommu_sync_cache(pte, sizeof(struct dma_pte));
> +
> + /*
> + * While the (ab)use of PTE_kind_table here allows to save some work in
> + * the function, the main motivation for it is that it avoids a so far
> + * unexplained hang during boot (while preparing Dom0) on a Westmere
> + * based laptop.
> + */
> + pt_update_contig_markers(&page->val,
> + address_level_offset(dfn_to_daddr(dfn), level),
> + level,
> + (hd->platform_ops->page_sizes &
> + (1UL << level_to_offset_bits(level + 1))
> + ? PTE_kind_leaf : PTE_kind_table));
> +
> spin_unlock(&hd->arch.mapping_lock);
> unmap_vtd_domain_page(page);
>
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |