
[PATCH] x86: allow non-BIGMEM configs to boot on >= 16Tb systems


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 7 Jun 2023 08:17:30 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 07 Jun 2023 06:17:52 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

While frame table setup, directmap init, and boot allocator population
all respect the intended bounds, the logic which hands to the heap
allocator the memory that wasn't passed to the boot allocator fails to
respect max_{pdx,pfn}. This then typically triggers the BUG() in
free_heap_pages() which follows the page state check, because a struct
page_info instance is encountered which was set to all ~0.

Of course all the memory above the 16Tb boundary is still going to
remain unused; using it requires BIGMEM=y. And of course this fix
similarly ought to help BIGMEM=y configurations on >= 123Tb systems
(where all the memory beyond that boundary continues to be unused).

Fixes: bac2000063ba ("x86-64: reduce range spanned by 1:1 mapping and frame table indexes")
Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
Sadly the people reporting the issue have decided to go with the 16Tb
limit, and hence the patch wasn't tested by them. I thought that I'd
still post it, though.

The "must not be passed to the boot allocator" restriction for the range
in question may no longer apply, with all page tables now being mapped
via map_domain_page() (iirc this work has been completed). But there
would be a risk that something else is/was overlooked, and hence the
offending code is being fixed rather than purged; the purging should
occur once the directmap is properly gone. This also seems preferable
for potential backports of this change.
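
For illustration only, the per-region clamping done by the hunk below can
be looked at in isolation. This is a minimal standalone sketch, not code
from the Xen tree: clamp_region() is a hypothetical helper, PFN_DOWN() and
pfn_to_paddr() are local stand-ins assumed to behave like the Xen macros
of the same names, and lo / hi are plain parameters rather than being
derived from HYPERVISOR_VIRT_END and max_pdx. Unlike the loop in setup.c,
the sketch also reports a range which ends up empty instead of passing it
on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/* Local stand-ins for the Xen macros of the same names (assumed semantics). */
#define PFN_DOWN(x)     ((uint64_t)(x) >> PAGE_SHIFT)
#define pfn_to_paddr(p) ((uint64_t)(p) << PAGE_SHIFT)

/*
 * Hypothetical helper: trim [addr, addr + size) to whole pages whose frame
 * numbers lie strictly between lo and hi.  Returns false if nothing is left.
 */
static bool clamp_region(uint64_t addr, uint64_t size,
                         uint64_t lo, uint64_t hi,
                         uint64_t *s_out, uint64_t *e_out)
{
    uint64_t mask = PAGE_SIZE - 1;
    uint64_t s = (addr + mask) & ~mask;   /* round start up to a page */
    uint64_t e = (addr + size) & ~mask;   /* round end down to a page */

    assert(lo < hi);

    /* Entirely at or below lo, or entirely at or above hi: skip. */
    if ( PFN_DOWN(e) <= lo || PFN_DOWN(s) >= hi )
        return false;

    /* Straddles lo: start at the first frame above it. */
    if ( PFN_DOWN(s) <= lo )
        s = pfn_to_paddr(lo + 1);

    /* Straddles hi: stop at hi (this is the bound the old code lacked). */
    if ( PFN_DOWN(e) > hi )
        e = pfn_to_paddr(hi);

    *s_out = s;
    *e_out = e;

    return s < e;
}
```

With lo = 100 and hi = 200, a region covering frames 150 to 249 would be
trimmed to frames 150 to 199, while a region starting at frame 200 would
be skipped altogether.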

--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -1722,15 +1722,16 @@ void __init noreturn __start_xen(unsigne
 
     if ( max_page - 1 > virt_to_mfn(HYPERVISOR_VIRT_END - 1) )
     {
-        unsigned long limit = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
+        unsigned long lo = virt_to_mfn(HYPERVISOR_VIRT_END - 1);
+        unsigned long hi = pdx_to_pfn(max_pdx - 1) + 1;
         uint64_t mask = PAGE_SIZE - 1;
 
         if ( !highmem_start )
-            xenheap_max_mfn(limit);
+            xenheap_max_mfn(lo);
 
         end_boot_allocator();
 
-        /* Pass the remaining memory to the allocator. */
+        /* Pass the remaining memory in the (lo, hi) range to the allocator. */
         for ( i = 0; i < boot_e820.nr_map; i++ )
         {
             uint64_t s, e;
@@ -1739,10 +1740,12 @@ void __init noreturn __start_xen(unsigne
                 continue;
             s = (boot_e820.map[i].addr + mask) & ~mask;
             e = (boot_e820.map[i].addr + boot_e820.map[i].size) & ~mask;
-            if ( PFN_DOWN(e) <= limit )
+            if ( PFN_DOWN(e) <= lo || PFN_DOWN(s) >= hi )
                 continue;
-            if ( PFN_DOWN(s) <= limit )
-                s = pfn_to_paddr(limit + 1);
+            if ( PFN_DOWN(s) <= lo )
+                s = pfn_to_paddr(lo + 1);
+            if ( PFN_DOWN(e) > hi )
+                e = pfn_to_paddr(hi);
             init_domheap_pages(s, e);
         }
     }
