
Re: ACPI NVS range conflicting with Dom0 page tables (or kernel image)



On Tue, Aug 06, 2024 at 04:12:32PM +0200, Jürgen Groß wrote:
> Marek,
> 
> On 17.06.24 16:03, Marek Marczykowski-Górecki wrote:
> > On Mon, Jun 17, 2024 at 01:22:37PM +0200, Jan Beulich wrote:
> > > Hello,
> > > 
> > > while it feels like we had a similar situation before, I can't seem to be
> > > able to find traces thereof, or associated (Linux) commits.
> > 
> > Is it some AMD Threadripper system by any chance? Previous thread on this
> > issue:
> > https://lore.kernel.org/xen-devel/CAOCpoWdOH=xGxiQSC1c5Ueb1THxAjH4WiZbCZq-QT+d_KAk3SA@xxxxxxxxxxxxxx/
> > 
> > > With
> > > 
> > > (XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x4000000
> > > ...
> > > (XEN)  Dom0 alloc.:   0000000440000000->0000000448000000 (619175 pages to be allocated)
> > > ...
> > > (XEN)  Loaded kernel: ffffffff81000000->ffffffff84000000
> > > 
> > > the kernel occupies the space from 16Mb to 64Mb in the initial allocation.
> > > Page tables come (almost) directly above:
> > > 
> > > (XEN)  Page tables:   ffffffff84001000->ffffffff84026000
> > > 
> > > I.e. they're just above the 64Mb boundary. Yet sadly in the host E820 map
> > > there is
> > > 
> > > (XEN)  [0000000004000000, 0000000004009fff] (ACPI NVS)
> > > 
> > > i.e. a non-RAM range starting at 64Mb. The kernel (currently) won't
> > > tolerate such an overlap (nor would it if the range overlapped the kernel
> > > image, e.g. if a sufficiently larger kernel was used on the machine in
> > > question). Yet with its fundamental goal of making its E820 match the host
> > > one I'm also in trouble thinking of possible solutions / workarounds. I
> > > certainly do not see Xen trying to cover for this, as the E820 map
> > > re-arrangement is purely a kernel side decision (forward ported kernels got
> > > away without, and what e.g. the BSDs do is entirely unknown to me).
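
The overlap described above can be checked with a little arithmetic. This is only a sketch, not anything Xen or Linux actually runs: it assumes the virtual-to-physical offset implied by the two log lines (virtual 0xffffffff81000000 corresponding to paddr 0x1000000), assumes the "->" ranges in the log have exclusive end addresses, and treats the E820 end address as inclusive. The helper names are hypothetical.

```python
PAGE_SIZE = 4096

# Assumption: offset derived from the log, where the kernel is loaded at
# virtual 0xffffffff81000000 and physical (paddr) 0x1000000.
KERNEL_VIRT_START = 0xFFFFFFFF81000000
KERNEL_PHYS_START = 0x01000000
VIRT_OFFSET = KERNEL_VIRT_START - KERNEL_PHYS_START


def virt_to_phys(vaddr):
    """Translate a kernel-mapping virtual address to its initial physical address."""
    return vaddr - VIRT_OFFSET


def overlap(a_start, a_end, b_start, b_end):
    """Return the overlap of two half-open ranges [start, end), or None."""
    start = max(a_start, b_start)
    end = min(a_end, b_end)
    return (start, end) if start < end else None


# "Page tables: ffffffff84001000->ffffffff84026000" (end assumed exclusive)
page_tables = (virt_to_phys(0xFFFFFFFF84001000), virt_to_phys(0xFFFFFFFF84026000))

# "[0000000004000000, 0000000004009fff] (ACPI NVS)" (end inclusive, so add 1)
acpi_nvs = (0x04000000, 0x04009FFF + 1)

hit = overlap(*page_tables, *acpi_nvs)
if hit:
    pages = (hit[1] - hit[0]) // PAGE_SIZE
    print(f"page tables overlap ACPI NVS at {hit[0]:#x}-{hit[1]:#x} ({pages} pages)")
else:
    print("no overlap")
```

Under those assumptions the page tables start at 0x4001000, one page above the 64Mb boundary, and so fall inside the ACPI NVS range.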
> > 
> > In Qubes we have worked around the issue by moving the kernel lower
> > (CONFIG_PHYSICAL_START=0x200000):
> > https://github.com/QubesOS/qubes-linux-kernel/commit/3e8be4ac1682370977d4d0dc1d782c428d860282
> > 
> > Far from ideal, but gets it bootable...
> > 
> 
> could you test the attached kernel patches? They should fix the issue without
> having to modify CONFIG_PHYSICAL_START.
> 
> I have tested them to boot up without problem on my test system, but I don't
> have access to a system showing the E820 map conflict you are seeing.
> 
> The patches have been developed against kernel 6.11-rc2, but I think they
> should apply to a 6.10 and maybe even an older kernel.

Sure, but tomorrow-ish.

> If possible it would be nice to verify that suspend to disk still works, as
> the kernel will need to access the ACPI NVS area in this case.

That might be harder, as Qubes OS doesn't support suspend to disk, but
I'll see if something can be done.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc
Description: PGP signature


 

