
[PATCH v4 1/6] x86/PVH: improve Dom0 memory size calculation


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 29 Sep 2021 15:13:24 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 29 Sep 2021 13:13:32 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Assuming that the accounting for IOMMU page tables will also take care
of the P2M needs was wrong: dom0_paging_pages() can determine a far
higher value, high enough for the system to run out of memory while
setting up Dom0. Hence in the case of shared page tables the larger of
the two values needs to be used (without shared page tables the sum of
both continues to be applicable).
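
Expressed as code (an illustrative sketch only, not part of the patch;
the helper name and parameters are made up, with "shared" standing in
for iommu_use_hap_pt(d)): once the IOMMU estimate has already been
subtracted from the available memory, reserving the larger of the two
totals amounts to subtracting just the excess of the P2M estimate over
the IOMMU one.

    #include <stdbool.h>

    /*
     * Illustration only: how much more to take out of the remaining
     * memory once "iommu_pages" has already been subtracted.  "shared"
     * stands in for iommu_use_hap_pt(d).
     */
    static unsigned long extra_paging_reservation(unsigned long p2m_pages,
                                                  unsigned long iommu_pages,
                                                  bool shared)
    {
        if ( !shared )                  /* separate tables: sum of both */
            return p2m_pages;
        return p2m_pages > iommu_pages  /* shared: larger of the two */
               ? p2m_pages - iommu_pages : 0;
    }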

To not further complicate the logic, eliminate the up-to-2-iteration
loop in favor of doing a few calculations twice (before and after
calling dom0_paging_pages()). While this will lead to slightly too high
a value in "cpu_pages", it is deemed better to account for a few pages
too many than for a few too few.

Also uniformly use paging_mode_enabled(), not is_hvm_domain().

While there, also account for two further aspects in the PV case: With
"iommu=dom0-passthrough" no IOMMU page tables would get allocated, so
none need accounting for. And if shadow mode is to be enabled, setting
aside a suitable amount for the P2M pool to get populated is also
necessary (i.e. similar to the non-shared-page-tables case of PVH).
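
Condensed, the two predicates governing the reservations look as
sketched below (illustration only; the helper names are hypothetical,
and the authoritative checks are the ones in the hunks that follow):

    /*
     * Illustration only, mirroring the checks used in
     * dom0_compute_nr_pages() below.
     */
    static bool __init must_account_iommu(const struct domain *d)
    {
        return is_iommu_enabled(d) && !iommu_hwdom_passthrough;
    }

    static bool __init must_account_paging(const struct domain *d)
    {
        return paging_mode_enabled(d) || opt_dom0_shadow;
    }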

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
I wonder whether this isn't enough to drop the "PVH dom0 without
dom0_mem" warning.

--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -318,8 +318,7 @@ unsigned long __init dom0_compute_nr_pag
     struct domain *d, struct elf_dom_parms *parms, unsigned long initrd_len)
 {
     nodeid_t node;
-    unsigned long avail = 0, nr_pages, min_pages, max_pages;
-    bool need_paging;
+    unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
 
     /* The ordering of operands is to work around a clang5 issue. */
     if ( CONFIG_DOM0_MEM[0] && !dom0_mem_set )
@@ -337,53 +336,65 @@ unsigned long __init dom0_compute_nr_pag
         avail -= d->max_vcpus - 1;
 
     /* Reserve memory for iommu_dom0_init() (rough estimate). */
-    if ( is_iommu_enabled(d) )
+    if ( is_iommu_enabled(d) && !iommu_hwdom_passthrough )
     {
         unsigned int s;
 
         for ( s = 9; s < BITS_PER_LONG; s += 9 )
-            avail -= max_pdx >> s;
+            iommu_pages += max_pdx >> s;
+
+        avail -= iommu_pages;
+    }
+
+    nr_pages = get_memsize(&dom0_size, avail);
+
+    /*
+     * If allocation isn't specified, reserve 1/16th of available memory for
+     * things like DMA buffers. This reservation is clamped to a maximum of
+     * 128MB.
+     */
+    if ( !nr_pages )
+    {
+        nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
+                            : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
+        if ( paging_mode_enabled(d) )
+            /*
+             * Temporary workaround message until internal (paging) memory
+             * accounting required to build a pvh dom0 is improved.
+             */
+            printk("WARNING: PVH dom0 without dom0_mem set is still unstable. "
+                   "If you get crashes during boot, try adding a dom0_mem 
parameter\n");
     }
 
-    need_paging = is_hvm_domain(d) &&
-        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));
-    for ( ; ; need_paging = false )
+    if ( paging_mode_enabled(d) || opt_dom0_shadow )
     {
-        nr_pages = get_memsize(&dom0_size, avail);
-        min_pages = get_memsize(&dom0_min_size, avail);
-        max_pages = get_memsize(&dom0_max_size, avail);
+        unsigned long cpu_pages;
 
         /*
-         * If allocation isn't specified, reserve 1/16th of available memory
-         * for things like DMA buffers. This reservation is clamped to a
-         * maximum of 128MB.
+         * Clamp according to min/max limits and available memory
+         * (preliminary).
          */
-        if ( !nr_pages )
-        {
-            nr_pages = avail - (pv_shim ? pv_shim_mem(avail)
-                                 : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
-            if ( is_hvm_domain(d) && !need_paging )
-                /*
-                 * Temporary workaround message until internal (paging) memory
-                 * accounting required to build a pvh dom0 is improved.
-                 */
-                printk("WARNING: PVH dom0 without dom0_mem set is still 
unstable. "
-                       "If you get crashes during boot, try adding a dom0_mem 
parameter\n");
-        }
-
-
-        /* Clamp according to min/max limits and available memory. */
-        nr_pages = max(nr_pages, min_pages);
-        nr_pages = min(nr_pages, max_pages);
+        nr_pages = max(nr_pages, get_memsize(&dom0_min_size, avail));
+        nr_pages = min(nr_pages, get_memsize(&dom0_max_size, avail));
         nr_pages = min(nr_pages, avail);
 
-        if ( !need_paging )
-            break;
+        cpu_pages = dom0_paging_pages(d, nr_pages);
 
-        /* Reserve memory for shadow or HAP. */
-        avail -= dom0_paging_pages(d, nr_pages);
+        if ( !iommu_use_hap_pt(d) )
+            avail -= cpu_pages;
+        else if ( cpu_pages > iommu_pages )
+            avail -= cpu_pages - iommu_pages;
     }
 
+    nr_pages = get_memsize(&dom0_size, avail);
+    min_pages = get_memsize(&dom0_min_size, avail);
+    max_pages = get_memsize(&dom0_max_size, avail);
+
+    /* Clamp according to min/max limits and available memory (final). */
+    nr_pages = max(nr_pages, min_pages);
+    nr_pages = min(nr_pages, max_pages);
+    nr_pages = min(nr_pages, avail);
+
     if ( is_pv_domain(d) &&
          (parms->p2m_base == UNSET_ADDR) && !memsize_gt_zero(&dom0_size) &&
          (!memsize_gt_zero(&dom0_min_size) || (nr_pages > min_pages)) )
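
For reference, the IOMMU page-table estimate in the first hunk sums
max_pdx >> 9 + max_pdx >> 18 + ..., i.e. one page-table page per 512
mapped pages at each paging level; the series converges to a little
under max_pdx / 511. A standalone sketch (the 1TiB figure is made up
for illustration):

    /* Standalone sketch of the rough IOMMU page-table estimate above:
     * one page-table page per 512 mapped pages at each paging level. */
    #include <stdio.h>

    #define BITS_PER_LONG (sizeof(unsigned long) * 8)

    int main(void)
    {
        unsigned long max_pdx = 1UL << 28;  /* 1TiB worth of 4k pages */
        unsigned long iommu_pages = 0;
        unsigned int s;

        for ( s = 9; s < BITS_PER_LONG; s += 9 )
            iommu_pages += max_pdx >> s;

        /* 2^19 + 2^10 + 2^1 = 525314 pages, i.e. about 2GiB. */
        printf("estimated page-table pages: %lu (~%lu MiB)\n",
               iommu_pages, iommu_pages >> 8);
        return 0;
    }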
