
Re: Discussion of Xenheap problems on AArch64



Hi Henry,

On 14/05/2021 05:35, Henry Wang wrote:
From: Julien Grall <julien@xxxxxxx>
Hi Julien,


On 11/05/2021 02:11, Henry Wang wrote:
Hi Julien,
Hi Henry,

From: Julien Grall <julien@xxxxxxx>
Hi Henry,

On 07/05/2021 05:06, Henry Wang wrote:
From: Julien Grall <julien@xxxxxxx>
On 28/04/2021 10:28, Henry Wang wrote:
[...]

When I continued booting Xen, I got the following error log:

(XEN) Xen call trace:
(XEN)    [<00000000002b5a5c>] alloc_boot_pages+0x94/0x98 (PC)
(XEN)    [<00000000002ca3bc>] setup_frametable_mappings+0xa4/0x108 (LR)
(XEN)    [<00000000002ca3bc>] setup_frametable_mappings+0xa4/0x108
(XEN)    [<00000000002cb988>] start_xen+0x344/0xbcc
(XEN)    [<00000000002001c0>] arm64/head.o#primary_switched+0x10/0x30
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) Xen BUG at page_alloc.c:432
(XEN) ****************************************

This is happening without my patch series applied, right? If so, what
happens if you apply it?

No, I am afraid this is with your patch series applied, and that is why I
am a little bit confused about the error log...

You are hitting the BUG() at the end of alloc_boot_pages(). This is hit
because the boot allocator couldn't allocate memory for your request.
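
To illustrate why the BUG() is reached, here is a minimal model of the search loop. It is a sketch and not the code in page_alloc.c: the names alloc_boot_pages_model, struct region and ALLOC_FAILED are made up for illustration, and the model simply drops the tail of a region after carving from it, which I assume is good enough to show the failure path.

```c
#include <stdint.h>

#define ALLOC_FAILED UINT64_MAX     /* hypothetical marker; Xen calls BUG() instead */

struct region { uint64_t s, e; };   /* page frame range [s, e) */

/*
 * Walk the regions from the highest one down and try to carve nr_pfns
 * pages, aligned to pfn_align (a power of two), from the top of a region.
 */
uint64_t alloc_boot_pages_model(struct region *r, unsigned int nr,
                                uint64_t nr_pfns, uint64_t pfn_align)
{
    unsigned int i = nr;

    while ( i-- )
    {
        uint64_t pg;

        if ( r[i].e - r[i].s < nr_pfns )
            continue;                   /* region is too small */

        pg = (r[i].e - nr_pfns) & ~(pfn_align - 1);
        if ( pg < r[i].s )
            continue;                   /* alignment pushed us below the start */

        r[i].e = pg;                    /* simplification: the tail is dropped */
        return pg;
    }

    return ALLOC_FAILED;                /* no region fits the request */
}
```

If every region is smaller than the request, the loop falls through to the end, which is the point where the boot allocator gives up.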

Would you be able to apply the following diff and paste the output here?

Thank you, of course. Please see the output below :)


diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index ace6333c18ea..dbb736fdb275 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -329,6 +329,8 @@ void __init init_boot_pages(paddr_t ps, paddr_t pe)
       if ( pe <= ps )
           return;

+    printk("%s: ps %"PRI_paddr" pe %"PRI_paddr"\n", __func__, ps, pe);
                                               ^ FYI: I had to change this PRI_paddr to PRIpaddr
                                                 to make the compiler happy

Ah yes, we don't have a variant with _. I thought I had compile-tested it before sending :(.


+
       first_valid_mfn = mfn_min(maddr_to_mfn(ps), first_valid_mfn);

       bootmem_region_add(ps >> PAGE_SHIFT, pe >> PAGE_SHIFT);
@@ -395,6 +397,8 @@ mfn_t __init alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
       unsigned long pg, _e;
       unsigned int i = nr_bootmem_regions;

+    printk("%s: nr_pfns %lu pfn_align %lu\n", __func__, nr_pfns, pfn_align);
+
       BUG_ON(!nr_bootmem_regions);

       while ( i-- )


I also added some printk calls to make sure the dtb is parsed correctly. For the
error case, I get the following log:

Thank you for the log.


(XEN) ----------banks=2--------
(XEN) ----------start=80000000--------
(XEN) ----------size=7F000000--------
(XEN) ----------start=F900000000--------
(XEN) ----------size=80000000--------
(XEN) Checking for initrd in /chosen
(XEN) RAM: 0000000080000000 - 00000000feffffff
(XEN) RAM: 000000f900000000 - 000000f97fffffff
(XEN)
(XEN) MODULE[0]: 0000000084000000 - 00000000841464c8 Xen
(XEN) MODULE[1]: 00000000841464c8 - 0000000084148c9b Device Tree
(XEN) MODULE[2]: 0000000080080000 - 0000000081080000 Kernel
(XEN)  RESVD[0]: 0000000080000000 - 0000000080010000
(XEN)
(XEN) Command line: noreboot dom0_mem=1024M console=dtuart dtuart=serial0 bootscrub=0
(XEN) PFN compression on bits 21...22
(XEN) init_boot_pages: ps 0000000080010000 pe 0000000080080000

The size of this region is 448KB.

(XEN) init_boot_pages: ps 0000000081080000 pe 0000000084000000

The size of this region is 47MB.

(XEN) init_boot_pages: ps 0000000084149000 pe 00000000ff000000

The size of this region is 1966MB.


(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) init_boot_pages: ps 000000f900000000 pe 000000f980000000

The size of this region is 2048MB.

(XEN) alloc_boot_pages: nr_pfns 909312 pfn_align 8192

This is asking for 3552MB of contiguous memory, which cannot be accommodated: the largest region is only 2048MB. In any case, this is quite a large region to ask for.
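
For reference, the sizes quoted above can be re-derived from the log with a couple of helpers. This is only a sketch: region_bytes and request_mb are illustrative names, and PAGE_SHIFT is assumed to be 12 (4KB pages).

```c
#include <stdint.h>

#define PAGE_SHIFT 12   /* assumption: 4KB pages */

/* Size in bytes of a [ps, pe) region as printed by init_boot_pages(). */
uint64_t region_bytes(uint64_t ps, uint64_t pe)
{
    return pe > ps ? pe - ps : 0;
}

/* Size in MB of an alloc_boot_pages() request for nr_pfns pages. */
uint64_t request_mb(uint64_t nr_pfns)
{
    return (nr_pfns << PAGE_SHIFT) >> 20;
}
```

For example, region_bytes(0x80010000, 0x80080000) gives 0x70000 bytes (448KB), and request_mb(909312) gives 3552, which is more than the 2048MB available in the largest region.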

Same...

(XEN) Xen BUG at page_alloc.c:436

For comparison, I have also attached the log for the highest start address
(f800000000) of the second memory bank at which Xen still boots correctly:

(XEN) ----------banks=2--------
(XEN) ----------start=80000000--------
(XEN) ----------size=7F000000--------
(XEN) ----------start=F800000000--------
(XEN) ----------size=80000000--------
(XEN) Checking for initrd in /chosen
(XEN) RAM: 0000000080000000 - 00000000feffffff
(XEN) RAM: 000000f800000000 - 000000f87fffffff
(XEN)
(XEN) MODULE[0]: 0000000084000000 - 00000000841464c8 Xen
(XEN) MODULE[1]: 00000000841464c8 - 0000000084148c9b Device Tree
(XEN) MODULE[2]: 0000000080080000 - 0000000081080000 Kernel
(XEN)  RESVD[0]: 0000000080000000 - 0000000080010000
(XEN)
(XEN) Command line: noreboot dom0_mem=1024M console=dtuart dtuart=serial0 bootscrub=0
(XEN) PFN compression on bits 20...22
(XEN) init_boot_pages: ps 0000000080010000 pe 0000000080080000
(XEN) init_boot_pages: ps 0000000081080000 pe 0000000084000000
(XEN) init_boot_pages: ps 0000000084149000 pe 00000000ff000000
(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) alloc_boot_pages: nr_pfns 1 pfn_align 1
(XEN) init_boot_pages: ps 000000f800000000 pe 000000f880000000
(XEN) alloc_boot_pages: nr_pfns 450560 pfn_align 8192

... here. We are trying to allocate a 1.7GB frametable (450560 pages of 4KB). You have only 4GB of memory so the frametable should be a lot smaller (a few tens of MB).

This is happening because PDX is not able to find many bits to compress.
I am not sure we can compress more with the current PDX algorithm. This may require some extensive improvement to reduce the footprint.
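
As a rough cross-check, the two nr_pfns values in your logs can be reproduced from the RAM range and the "PFN compression on bits" line. The sketch below makes several assumptions that are not stated in this thread: sizeof(struct page_info) is taken to be 56 bytes, the frametable covers the range from the start of the first bank to the end of the last one, and its size is rounded up to 32MB (2MB when it is smaller than 32MB). The function names are made up for illustration.

```c
#include <stdint.h>

#define PAGE_SHIFT      12                      /* assumption: 4KB pages */
#define PAGE_INFO_SIZE  56ULL                   /* assumed sizeof(struct page_info) */
#define MB(x)           ((uint64_t)(x) << 20)

/* Remove PFN bits [lo, hi], i.e. the hole from "PFN compression on bits lo...hi". */
uint64_t pfn_to_pdx_sketch(uint64_t pfn, unsigned int lo, unsigned int hi)
{
    uint64_t bottom = pfn & ((1ULL << lo) - 1);
    uint64_t top = pfn >> (hi + 1);

    return (top << lo) | bottom;
}

/* Number of pages requested for a frametable covering [ps, pe). */
uint64_t frametable_pages(uint64_t ps, uint64_t pe,
                          unsigned int lo, unsigned int hi)
{
    uint64_t first = pfn_to_pdx_sketch(ps >> PAGE_SHIFT, lo, hi);
    uint64_t last = pfn_to_pdx_sketch((pe >> PAGE_SHIFT) - 1, lo, hi);
    uint64_t bytes = (last - first + 1) * PAGE_INFO_SIZE;
    uint64_t unit = bytes < MB(32) ? MB(2) : MB(32);

    bytes = (bytes + unit - 1) / unit * unit;   /* round up to the mapping size */
    return bytes >> PAGE_SHIFT;
}

/* Frametable size in bytes if only the pages that really exist were covered. */
uint64_t dense_frametable_bytes(uint64_t ram_bytes)
{
    return (ram_bytes >> PAGE_SHIFT) * PAGE_INFO_SIZE;
}
```

Under these assumptions, frametable_pages(0x80000000, 0xf880000000, 20, 22) gives 450560 and frametable_pages(0x80000000, 0xf980000000, 21, 22) gives 909312, matching both logs. Moving the second bank to f900000000 sets one more PFN bit (bit 20), so one bit fewer can be compressed and the frametable roughly doubles. For comparison, dense_frametable_bytes(0x7F000000 + 0x80000000) is about 56MB.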

In a previous e-mail, you said you tweaked the FVP model to set those regions. Were you trying to mimic the memory layout of real HW (either current or future)?

Cheers,

--
Julien Grall



 

