
Re: Cannot boot PVH dom0 with big initrd


  • To: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 13 Feb 2026 21:40:31 +0100
  • Cc: Jan Beulich <jbeulich@xxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 13 Feb 2026 20:40:54 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Fri, Feb 13, 2026 at 04:56:39PM +0100, Roger Pau Monné wrote:
> On Fri, Feb 13, 2026 at 09:56:42AM +0100, Jan Beulich wrote:
> > On 13.02.2026 05:02, Marek Marczykowski-Górecki wrote:
> > > Hi,
> > > 
> > > After fixing the xhci crash, I hit another issue - booting with 236MB
> > > initrd doesn't work, I get:
> > > 
> > >     (XEN) [    3.151856] *** Building a PVH Dom0 ***
> > >     ...
> > >     (XEN) [    3.593940] Unable to allocate memory with order 0!
> > >     (XEN) [    3.597110] Failed to setup Dom0 physical memory map
> > >     (XEN) [    3.599884] 
> > >     (XEN) [    3.602482] ****************************************
> > >     (XEN) [    3.605272] Panic on CPU 0:
> > >     (XEN) [    3.607928] Could not construct d0
> > >     (XEN) [    3.610692] ****************************************
> > >     (XEN) [    3.613463] 
> > >     (XEN) [    3.616035] Reboot in five seconds...
> > >     (XEN) [    8.626565] Resetting with ACPI MEMORY or I/O RESET_REG.
> > > 
> > > Full console log: 
> > > https://gist.github.com/marmarek/c9dbc87bf07b76f2899781755762f565
> > > 
> > > If I skip initrd, then it boots just fine (but dom0 is not happy about
> > > that). 164MB initrd failed too, but 13MB started ok.
> > > Just in case, I tried skipping XHCI console, but it didn't change
> > > anything.
> > > 
> > > Host has 16GB of memory, and there is no dom0_mem= parameter. Xen is
> > > started from GRUB, using MB2+EFI.
> > 
> > Hmm, yes, there's an ordering issue: Of course we free initrd space (as
> > used for passing from the boot loader to Xen) only after copying to the
> > designated guest area. Yet dom0_compute_nr_pages(), intentionally,
> > includes the space in its calculation (adding initial_images_nrpages()'s
> > return value). PV Dom0 isn't affected because to load a huge initrd
> > there, the kernel has to request the initrd to not be mapped into the
> > initial allocation.
> 
> Right, on PV dom0 we do not copy the image to a new set of pages; we
> simply assign the pages where the initrd resides to the domain.  We
> can't populate those pages in the p2m as-is; otherwise we would
> shatter super pages.
> 
> I think the fix below should do it; it's likely the best we can do.
> Can you please give it a try, Marek?
> 
> Thanks, Roger.
> ---
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index 0b467fd4a4fc..8e3cb5d0db76 100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -343,7 +343,7 @@ unsigned long __init dom0_compute_nr_pages(
>  
>      for_each_node_mask ( node, dom0_nodes )
>          avail += avail_domheap_pages_region(node, 0, 0) +
> -                 initial_images_nrpages(node);
> +                 (is_pv_domain(d) ? initial_images_nrpages(node) : 0);
>  
>      /* Reserve memory for further dom0 vcpu-struct allocations... */
>      avail -= (d->max_vcpus - 1UL)

I'm working on a more complex patch that attempts to account the memory
used by the init images toward the reserved amount that's kept by Xen.
This should make the accounting a bit better, in that we won't end up
keeping back both the usual Xen reservation and the memory used by the
init images.
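
In rough terms, for your configuration (16GB host, no dom0_mem=, ~236MB
initrd; a sketch only, the exact figures will differ a bit):

    avail     = free dom heap pages (initrd no longer counted for PVH)
    rsvd      = min(avail / 16, 128MB)   /* 128MB on this host     */
    rsvd     -= min(rsvd, init_images)   /* 236MB covers it, so 0  */
    nr_pages  = avail - rsvd             /* i.e. all of avail      */

Once the initrd has been copied into the dom0 allocation and the original
boot module pages are freed, those ~236MB return to Xen and take the place
of the usual reservation, rather than Xen keeping both.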

It's still a WIP, however; would you mind giving it a try?
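
If it works, a rough sanity check (just what I would expect, not something
I've verified on your setup): after boot, xl info should report
free_memory in the ballpark of the initrd size (~236MB), rather than the
~128MB reservation plus the initrd size that the simpler change quoted
above would leave in Xen's heap.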

Thanks, Roger.
---
diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
index 0b467fd4a4fc..3d54af197188 100644
--- a/xen/arch/x86/dom0_build.c
+++ b/xen/arch/x86/dom0_build.c
@@ -325,10 +325,18 @@ unsigned long __init dom0_paging_pages(const struct domain *d,
  * If allocation isn't specified, reserve 1/16th of available memory for
  * things like DMA buffers. This reservation is clamped to a maximum of 128MB.
  */
-static unsigned long __init default_nr_pages(unsigned long avail)
+static unsigned long __init default_nr_pages(unsigned long avail,
+                                             unsigned long init_images)
 {
-    return avail - (pv_shim ? pv_shim_mem(avail)
-                            : min(avail / 16, 128UL << (20 - PAGE_SHIFT)));
+    unsigned long rsvd = min(avail / 16, 128UL << (20 - PAGE_SHIFT));
+
+    /*
+     * Account for memory consumed by initial images as if it was part of the
+     * reserved amount.
+     */
+    rsvd -= rsvd <= init_images ? rsvd : init_images;
+
+    return avail - (pv_shim ? pv_shim_mem(avail) : rsvd);
 }
 
 unsigned long __init dom0_compute_nr_pages(
@@ -336,14 +344,28 @@ unsigned long __init dom0_compute_nr_pages(
 {
     nodeid_t node;
     unsigned long avail = 0, nr_pages, min_pages, max_pages, iommu_pages = 0;
+    unsigned long init_images = 0;
 
     /* The ordering of operands is to work around a clang5 issue. */
     if ( CONFIG_DOM0_MEM[0] && !dom0_mem_set )
         parse_dom0_mem(CONFIG_DOM0_MEM);
 
     for_each_node_mask ( node, dom0_nodes )
-        avail += avail_domheap_pages_region(node, 0, 0) +
-                 initial_images_nrpages(node);
+    {
+        avail += avail_domheap_pages_region(node, 0, 0);
+        init_images += initial_images_nrpages(node);
+    }
+
+    if ( is_pv_domain(d) )
+    {
+        /*
+         * For PV domains the initrd pages are directly assigned to the
+         * guest, and hence the initrd size counts as free memory that can
+         * be used by the domain.  Set to 0 to prevent further adjustments.
+         */
+        avail += init_images;
+        init_images = 0;
+    }
 
     /* Reserve memory for further dom0 vcpu-struct allocations... */
     avail -= (d->max_vcpus - 1UL)
@@ -367,7 +389,8 @@ unsigned long __init dom0_compute_nr_pages(
     {
         unsigned long cpu_pages;
 
-        nr_pages = get_memsize(&dom0_size, avail) ?: default_nr_pages(avail);
+        nr_pages = get_memsize(&dom0_size, avail) ?:
+                   default_nr_pages(avail, init_images);
 
         /*
          * Clamp according to min/max limits and available memory
@@ -385,7 +408,8 @@ unsigned long __init dom0_compute_nr_pages(
             avail -= cpu_pages - iommu_pages;
     }
 
-    nr_pages = get_memsize(&dom0_size, avail) ?: default_nr_pages(avail);
+    nr_pages = get_memsize(&dom0_size, avail) ?:
+               default_nr_pages(avail, init_images);
     min_pages = get_memsize(&dom0_min_size, avail);
     max_pages = get_memsize(&dom0_max_size, avail);
 
