Re: [PATCH v2 15/18] IOMMU/x86: prefill newly allocated page tables
On Fri, Sep 24, 2021 at 11:54:58AM +0200, Jan Beulich wrote:
> Page tables are used for two purposes after allocation: They either start
> out all empty, or they get filled to replace a superpage. Subsequently,
> to replace all empty or fully contiguous page tables, contiguous sub-
> regions will be recorded within individual page tables. Install the
> initial set of markers immediately after allocation. Make sure to retain
> these markers when further populating a page table in preparation for it
> to replace a superpage.
>
> The markers are simply 4-bit fields holding the order value of
> contiguous entries. To demonstrate this, if a page table had just 16
> entries, this would be the initial (fully contiguous) set of markers:
>
> index 0 1 2 3 4 5 6 7 8 9 A B C D E F
> marker 4 0 1 0 2 0 1 0 3 0 1 0 2 0 1 0
>
> "Contiguous" here means not only present entries with successively
> increasing MFNs, each one suitably aligned for its slot, but also a
> respective number of all non-present entries.
>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
Obviously this marker only works for newly created page tables right
now; the moment we start poking holes or replacing entries, the marker
is no longer updated. I expect further patches will expand on this.
> ---
> An alternative to the ASSERT()s added to set_iommu_ptes_present() would
> be to make the function less general-purpose; it's used in a single
> place only after all (i.e. it might as well be folded into its only
> caller).
> ---
> v2: New.
>
> --- a/xen/drivers/passthrough/amd/iommu-defs.h
> +++ b/xen/drivers/passthrough/amd/iommu-defs.h
> @@ -445,6 +445,8 @@ union amd_iommu_x2apic_control {
> #define IOMMU_PAGE_TABLE_U32_PER_ENTRY (IOMMU_PAGE_TABLE_ENTRY_SIZE / 4)
> #define IOMMU_PAGE_TABLE_ALIGNMENT 4096
>
> +#define IOMMU_PTE_CONTIG_MASK 0x1e /* The ign0 field below. */
Should you rename ign0 to contig_mask or some such now?
Same would apply to the comment next to dma_pte for VT-d, where bits
52:62 are ignored (the comment seems to be missing this already) and
we will be using bits 52:55 to store the contiguous mask for the
entry.
> +
> union amd_iommu_pte {
> uint64_t raw;
> struct {
> --- a/xen/drivers/passthrough/amd/iommu_map.c
> +++ b/xen/drivers/passthrough/amd/iommu_map.c
> @@ -116,7 +116,19 @@ static void set_iommu_ptes_present(unsig
>
> while ( nr_ptes-- )
> {
> - set_iommu_pde_present(pde, next_mfn, 0, iw, ir);
> + ASSERT(!pde->next_level);
> + ASSERT(!pde->u);
> +
> + if ( pde > table )
> + ASSERT(pde->ign0 == find_first_set_bit(pde - table));
> + else
> + ASSERT(pde->ign0 == PAGE_SHIFT - 3);
You could even special-case (pde - table) % 2 != 0, but this is
debug-only code, and it's possible a mod is more costly than
find_first_set_bit.
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -433,12 +433,12 @@ int iommu_free_pgtables(struct domain *d
> return 0;
> }
>
> -struct page_info *iommu_alloc_pgtable(struct domain *d)
> +struct page_info *iommu_alloc_pgtable(struct domain *d, uint64_t contig_mask)
> {
> struct domain_iommu *hd = dom_iommu(d);
> unsigned int memflags = 0;
> struct page_info *pg;
> - void *p;
> + uint64_t *p;
>
> #ifdef CONFIG_NUMA
> if ( hd->node != NUMA_NO_NODE )
> @@ -450,7 +450,28 @@ struct page_info *iommu_alloc_pgtable(st
> return NULL;
>
> p = __map_domain_page(pg);
> - clear_page(p);
> +
> + if ( contig_mask )
> + {
> + unsigned int i, shift = find_first_set_bit(contig_mask);
> +
> +    ASSERT(((PAGE_SHIFT - 3) & (contig_mask >> shift)) == PAGE_SHIFT - 3);
> +
> + p[0] = (PAGE_SHIFT - 3ull) << shift;
> + p[1] = 0;
> + p[2] = 1ull << shift;
> + p[3] = 0;
> +
> + for ( i = 4; i < PAGE_SIZE / 8; i += 4 )
> + {
> + p[i + 0] = (find_first_set_bit(i) + 0ull) << shift;
> + p[i + 1] = 0;
> + p[i + 2] = 1ull << shift;
> + p[i + 3] = 0;
> + }
You could likely do:
for ( i = 0; i < PAGE_SIZE / 8; i += 4 )
{
p[i + 0] = i ? ((find_first_set_bit(i) + 0ull) << shift)
: ((PAGE_SHIFT - 3ull) << shift);
p[i + 1] = 0;
p[i + 2] = 1ull << shift;
p[i + 3] = 0;
}
That avoids having to open-code the first loop iteration. The ternary
operator could also be nested before the shift, but I find that
harder to read.
> + }
> + else
> + clear_page(p);
>
> if ( hd->platform_ops->sync_cache )
> iommu_vcall(hd->platform_ops, sync_cache, p, PAGE_SIZE);
> --- a/xen/include/asm-x86/iommu.h
> +++ b/xen/include/asm-x86/iommu.h
> @@ -142,7 +142,8 @@ int pi_update_irte(const struct pi_desc
> })
>
> int __must_check iommu_free_pgtables(struct domain *d);
> -struct page_info *__must_check iommu_alloc_pgtable(struct domain *d);
> +struct page_info *__must_check iommu_alloc_pgtable(struct domain *d,
> + uint64_t contig_mask);
> void iommu_queue_free_pgtable(struct domain *d, struct page_info *pg);
>
> #endif /* !__ARCH_X86_IOMMU_H__ */
>