
[PATCH v3 22/23] VT-d: replace all-contiguous page tables by superpage mappings


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 10 Jan 2022 17:38:05 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>
  • Delivery-date: Mon, 10 Jan 2022 16:38:12 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

When a page table ends up with all of its entries mapping contiguous
memory with identical attributes, it can be replaced by a superpage
entry at the next higher level. The page table itself can then be
scheduled for freeing.
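
Purely for illustration, the property being detected can be spelled
out as the brute-force check below. This is not code from the patch:
the patch relies on pt_update_contig_markers() to notice the situation
as individual entries get updated. struct dma_pte, PTE_NUM and
LEVEL_STRIDE mirror the VT-d definitions; ADDR_MASK_4K, ATTR_MASK and
table_is_superpage_candidate() are made up for this sketch.

#include <stdbool.h>
#include <stdint.h>

#define LEVEL_STRIDE 9
#define PTE_NUM      (1 << LEVEL_STRIDE)

struct dma_pte { uint64_t val; };

/* Illustrative split of a PTE into address bits and "everything else". */
#define ADDR_MASK_4K 0x000ffffffffff000ULL
#define ATTR_MASK    (~ADDR_MASK_4K)

/*
 * True if all PTE_NUM entries of a table at the given level map
 * physically contiguous, suitably aligned memory with identical
 * attributes, i.e. if the whole table could be replaced by one
 * superpage entry one level up.  Assumes the table holds only leaf
 * (present) mappings.
 */
static bool table_is_superpage_candidate(const struct dma_pte *pt,
                                         unsigned int level)
{
    /* Bytes covered by one entry at this level (PAGE_SHIFT == 12). */
    uint64_t step = 1ULL << (12 + (level - 1) * LEVEL_STRIDE);
    uint64_t base = pt[0].val & ADDR_MASK_4K;
    uint64_t attr = pt[0].val & ATTR_MASK;
    unsigned int i;

    /* The would-be superpage must itself be aligned at the next level. */
    if ( base & ((step << LEVEL_STRIDE) - 1) )
        return false;

    for ( i = 0; i < PTE_NUM; ++i )
        if ( (pt[i].val & ADDR_MASK_4K) != base + i * step ||
             (pt[i].val & ATTR_MASK) != attr )
            return false;

    return true;
}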

The adjustment to LEVEL_MASK is merely to avoid leaving a latent trap
for whenever we (and obviously hardware) start supporting 512G mappings.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
Unlike the freeing of all-empty page tables, this causes quite a bit of
back and forth for PV domains, due to their mapping/unmapping of pages
when they get converted to/from being page tables. It may therefore be
worth considering delaying re-coalescing a little, to avoid doing so
when the superpage would otherwise get split again pretty soon. But I
think this would better be the subject of a separate change anyway.

Of course this could also be helped by more "aware" kernel-side
behavior: kernels could avoid immediately mapping freed page tables
writable again, in anticipation of re-using the same page for another
page table elsewhere.
---
v3: New.

--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -2071,14 +2071,35 @@ static int __must_check intel_iommu_map_
      * While the (ab)use of PTE_kind_table here allows to save some work in
      * the function, the main motivation for it is that it avoids a so far
      * unexplained hang during boot (while preparing Dom0) on a Westmere
-     * based laptop.
+     * based laptop.  This also has the intended effect of terminating the
+     * loop when super pages aren't supported anymore at the next level.
      */
-    pt_update_contig_markers(&page->val,
-                             address_level_offset(dfn_to_daddr(dfn), level),
-                             level,
-                             (hd->platform_ops->page_sizes &
-                              (1UL << level_to_offset_bits(level + 1))
-                              ? PTE_kind_leaf : PTE_kind_table));
+    while ( pt_update_contig_markers(&page->val,
+                                     address_level_offset(dfn_to_daddr(dfn), level),
+                                     level,
+                                     (hd->platform_ops->page_sizes &
+                                      (1UL << level_to_offset_bits(level + 1))
+                                       ? PTE_kind_leaf : PTE_kind_table)) )
+    {
+        struct page_info *pg = maddr_to_page(pg_maddr);
+
+        unmap_vtd_domain_page(page);
+
+        new.val &= ~(LEVEL_MASK << level_to_offset_bits(level));
+        dma_set_pte_superpage(new);
+
+        pg_maddr = addr_to_dma_page_maddr(d, dfn_to_daddr(dfn), ++level,
+                                          flush_flags, false);
+        BUG_ON(pg_maddr < PAGE_SIZE);
+
+        page = map_vtd_domain_page(pg_maddr);
+        pte = &page[address_level_offset(dfn_to_daddr(dfn), level)];
+        *pte = new;
+        iommu_sync_cache(pte, sizeof(*pte));
+
+        *flush_flags |= IOMMU_FLUSHF_modified | IOMMU_FLUSHF_all;
+        iommu_queue_free_pgtable(d, pg);
+    }
 
     spin_unlock(&hd->arch.mapping_lock);
     unmap_vtd_domain_page(page);
--- a/xen/drivers/passthrough/vtd/iommu.h
+++ b/xen/drivers/passthrough/vtd/iommu.h
@@ -229,7 +229,7 @@ struct context_entry {
 
 /* page table handling */
 #define LEVEL_STRIDE       (9)
-#define LEVEL_MASK         ((1 << LEVEL_STRIDE) - 1)
+#define LEVEL_MASK         (PTE_NUM - 1UL)
 #define PTE_NUM            (1 << LEVEL_STRIDE)
 #define level_to_agaw(val) ((val) - 2)
 #define agaw_to_level(val) ((val) + 2)
