
Re: E820 memory allocation issue on Threadripper platforms



On Wed, Jan 17, 2024 at 09:46:27AM +0100, Jan Beulich wrote:
> On 17.01.2024 07:12, Patrick Plenefisch wrote:
> > On Tue, Jan 16, 2024 at 4:33 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >> On 16.01.2024 01:22, Patrick Plenefisch wrote:
> >> It remains to be seen in how far it is reasonably possible to work
> >> around this in the kernel. While (sadly) still unsupported, in the
> >> meantime you may want to consider running Dom0 in PVH mode.
> >>
> > 
> > I tried this by adding dom0=pvh, and instead got this boot error:
> > 
> > (XEN) xenoprof: Initialization failed. AMD processor family 25 is not
> > supported
> > (XEN) NX (Execute Disable) protection active
> > (XEN) Dom0 has maximum 1400 PIRQs
> > (XEN) *** Building a PVH Dom0 ***
> > (XEN) Failed to load kernel: -1
> > (XEN) Xen dom0 kernel broken ELF: <NULL>
> > (XEN) Failed to load Dom0 kernel
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Could not construct domain 0
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Reboot in five seconds...
> 
> Hmm, that's sad. The more that the error messages aren't really
> informative. You did check though that your kernel is PVH-capable?
> (With a debug build of Xen, and with suitably high logging level,
> various of the ELF properties would be logged. Such output may or
> may not give further hints towards what's actually wrong. Albeit
> you using 4.17 this would further require you to pull in commit
> ea3dabfb80d7 ["x86/PVH: allow Dom0 ELF parsing to be verbose"].)
> 
> But wait - aren't you running into the same collision there with
> that memory region? I think that explains the unhelpful output.

I think so: elf_memcpy() in elf_load_image() is failing to copy the
image to the given destination address.  The error messages there
should be more helpful.
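
Just to illustrate the kind of check/message I have in mind (not the
actual libelf or dom0-build code; the struct layout and helper below
are made up):

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

struct e820_entry { uint64_t start, size; unsigned int type; };
#define E820_RAM 1

/* Return true if [dest, dest + len) is fully covered by a single RAM
 * region, otherwise print the offending range so the collision with
 * the memory map is obvious to the user. */
static bool dest_is_ram(const struct e820_entry *map, unsigned int nr,
                        uint64_t dest, uint64_t len)
{
    for ( unsigned int i = 0; i < nr; i++ )
        if ( map[i].type == E820_RAM &&
             dest >= map[i].start &&
             dest + len <= map[i].start + map[i].size )
            return true;

    fprintf(stderr,
            "kernel load range [%#llx, %#llx) not covered by E820 RAM\n",
            (unsigned long long)dest, (unsigned long long)(dest + len));
    return false;
}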

> Whereas I assume the native kernel can deal with that as long as
> it's built with CONFIG_RELOCATABLE=y. I don't think we want to
> get into the business of interpreting the kernel's internal
> representation of the relocations needed, so it's not really
> clear to me what we might do in such a case. Perhaps the only way
> is to signal to the kernel that it needs to apply relocations
> itself (which in turn would require the kernel to signal to us
> that it's capable of doing so). Cc-ing Roger in case he has any
> neat idea.

Hm, no, not really.

We could do what multiboot2 does: the kernel provides us with some
placement data (min/max addresses, alignment), and Xen lets the
kernel deal with relocations itself.
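
For reference, this is roughly the shape multiboot2 uses for that (the
"relocatable" header tag, written down from memory, so check the spec
before relying on the exact values; a PVH equivalent would presumably
be a new ELF note rather than this exact struct):

#include <stdint.h>

struct multiboot_header_tag_relocatable {
    uint16_t type;        /* MULTIBOOT_HEADER_TAG_RELOCATABLE (10) */
    uint16_t flags;
    uint32_t size;
    uint32_t min_addr;    /* lowest address the image may be placed at */
    uint32_t max_addr;    /* highest address the image may end at */
    uint32_t align;       /* required alignment of the load address */
    uint32_t preference;  /* 0 = none, 1 = lowest, 2 = highest */
};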

Additionally we could support the kernel providing a section with the
relocations and have Xen apply them, but that's likely to be
complicated at best, as I don't even know which kinds of relocations
we would have to support.
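
The easy part of that would look something like the sketch below,
assuming the image is linked at 0 and only needs R_X86_64_RELATIVE
entries (standalone illustration, not Xen code; anything beyond
relative relocations is where it gets messy):

#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Apply RELA relocations to an image loaded at 'base'.  Only the
 * position-independent R_X86_64_RELATIVE case is handled; any other
 * type would have to make the loader give up. */
static int apply_rela(uint8_t *base, const Elf64_Rela *rela, size_t count)
{
    for ( size_t i = 0; i < count; i++ )
    {
        if ( ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE )
            return -1;  /* unsupported relocation type */
        *(uint64_t *)(base + rela[i].r_offset) =
            (uint64_t)(uintptr_t)base + rela[i].r_addend;
    }
    return 0;
}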

I'm not sure how Linux deals with this in the bare-metal case: are
relocations done after decompressing and before jumping into the
entry point?

I would also need to check FreeBSD at least to have an idea of how
it's done there.

Thanks, Roger.



 

