
Re: E820 memory allocation issue on Threadripper platforms



On Wed, Jan 17, 2024 at 09:46:27AM +0100, Jan Beulich wrote:
> On 17.01.2024 07:12, Patrick Plenefisch wrote:
> > On Tue, Jan 16, 2024 at 4:33 AM Jan Beulich <jbeulich@xxxxxxxx> wrote:
> >> On 16.01.2024 01:22, Patrick Plenefisch wrote:
> >> It remains to be seen in how far it is reasonably possible to work
> >> around this in the kernel. While (sadly) still unsupported, in the
> >> meantime you may want to consider running Dom0 in PVH mode.
> >>
> > 
> > I tried this by adding dom0=pvh, and instead got this boot error:
> > 
> > (XEN) xenoprof: Initialization failed. AMD processor family 25 is not
> > supported
> > (XEN) NX (Execute Disable) protection active
> > (XEN) Dom0 has maximum 1400 PIRQs
> > (XEN) *** Building a PVH Dom0 ***
> > (XEN) Failed to load kernel: -1
> > (XEN) Xen dom0 kernel broken ELF: <NULL>
> > (XEN) Failed to load Dom0 kernel
> > (XEN)
> > (XEN) ****************************************
> > (XEN) Panic on CPU 0:
> > (XEN) Could not construct domain 0
> > (XEN) ****************************************
> > (XEN)
> > (XEN) Reboot in five seconds...
> 
> Hmm, that's sad. The more that the error messages aren't really
> informative. You did check though that your kernel is PVH-capable?
> (With a debug build of Xen, and with suitably high logging level,
> various of the ELF properties would be logged. Such output may or
> may not give further hints towards what's actually wrong. Albeit
> you using 4.17 this would further require you to pull in commit
> ea3dabfb80d7 ["x86/PVH: allow Dom0 ELF parsing to be verbose"].)
> 
> But wait - aren't you running into the same collision there with
> that memory region? I think that explains the unhelpful output.

I think so: elf_memcpy() in elf_load_image() is failing to copy the
image to the given destination address.  The error messages there
should be more helpful.
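
Just to illustrate the kind of check/message I have in mind (not the
actual libelf or dom0-build code; the struct layout and helper below
are made up):

#include <stdint.h>
#include <stdio.h>
#include <stdbool.h>

struct e820_entry { uint64_t start, size; unsigned int type; };
#define E820_RAM 1

/* Return true if [dest, dest + len) is fully covered by a single RAM
 * region, otherwise print the offending range so the collision with
 * the memory map is obvious to the user. */
static bool dest_is_ram(const struct e820_entry *map, unsigned int nr,
                        uint64_t dest, uint64_t len)
{
    for ( unsigned int i = 0; i < nr; i++ )
        if ( map[i].type == E820_RAM &&
             dest >= map[i].start &&
             dest + len <= map[i].start + map[i].size )
            return true;

    fprintf(stderr,
            "kernel load range [%#llx, %#llx) not covered by E820 RAM\n",
            (unsigned long long)dest, (unsigned long long)(dest + len));
    return false;
}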

> Whereas I assume the native kernel can deal with that as long as
> it's built with CONFIG_RELOCATABLE=y. I don't think we want to
> get into the business of interpreting the kernel's internal
> representation of the relocations needed, so it's not really
> clear to me what we might do in such a case. Perhaps the only way
> is to signal to the kernel that it needs to apply relocations
> itself (which in turn would require the kernel to signal to us
> that it's capable of doing so). Cc-ing Roger in case he has any
> neat idea.

Hm, no, not really.

We could do what multiboot2 does: the kernel provides us with some
placement data (min/max addresses, alignment), and Xen lets the
kernel deal with relocations itself.
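
For reference, this is roughly the shape multiboot2 uses for that (the
"relocatable" header tag, written down from memory, so check the spec
before relying on the exact values; a PVH equivalent would presumably
be a new ELF note rather than this exact struct):

#include <stdint.h>

struct multiboot_header_tag_relocatable {
    uint16_t type;        /* MULTIBOOT_HEADER_TAG_RELOCATABLE (10) */
    uint16_t flags;
    uint32_t size;
    uint32_t min_addr;    /* lowest address the image may be placed at */
    uint32_t max_addr;    /* highest address the image may end at */
    uint32_t align;       /* required alignment of the load address */
    uint32_t preference;  /* 0 = none, 1 = lowest, 2 = highest */
};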

Additionally we could support the kernel providing a section with the
relocations and have Xen apply them, but that's likely to be
complicated at best, as I don't even know which kinds of relocations
we would have to support.
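
The easy part of that would look something like the sketch below,
assuming the image is linked at 0 and only needs R_X86_64_RELATIVE
entries (standalone illustration, not Xen code; anything beyond
relative relocations is where it gets messy):

#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Apply RELA relocations to an image loaded at 'base'.  Only the
 * position-independent R_X86_64_RELATIVE case is handled; any other
 * type would have to make the loader give up. */
static int apply_rela(uint8_t *base, const Elf64_Rela *rela, size_t count)
{
    for ( size_t i = 0; i < count; i++ )
    {
        if ( ELF64_R_TYPE(rela[i].r_info) != R_X86_64_RELATIVE )
            return -1;  /* unsupported relocation type */
        *(uint64_t *)(base + rela[i].r_offset) =
            (uint64_t)(uintptr_t)base + rela[i].r_addend;
    }
    return 0;
}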

I'm not sure how Linux deals with this in the bare-metal case: are
relocations done after decompressing and before jumping into the
entry point?

I would also need to check FreeBSD at least to have an idea of how
it's done there.

Thanks, Roger.



 

