[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: ACPI NVS range conflicting with Dom0 page tables (or kernel image)



On 06.08.24 17:21, Marek Marczykowski-Górecki wrote:
On Tue, Aug 06, 2024 at 04:12:32PM +0200, Jürgen Groß wrote:
Marek,

On 17.06.24 16:03, Marek Marczykowski-Górecki wrote:
On Mon, Jun 17, 2024 at 01:22:37PM +0200, Jan Beulich wrote:
Hello,

while it feels like we had a similar situation before, I can't seem to be
able to find traces thereof, or associated (Linux) commits.

Is it some AMD Threadripper system by a chance? Previous thread on this
issue:
https://lore.kernel.org/xen-devel/CAOCpoWdOH=xGxiQSC1c5Ueb1THxAjH4WiZbCZq-QT+d_KAk3SA@xxxxxxxxxxxxxx/

With

(XEN)  Dom0 kernel: 64-bit, PAE, lsb, paddr 0x1000000 -> 0x4000000
...
(XEN)  Dom0 alloc.:   0000000440000000->0000000448000000 (619175 pages to be 
allocated)
...
(XEN)  Loaded kernel: ffffffff81000000->ffffffff84000000

the kernel occupies the space from 16Mb to 64Mb in the initial allocation.
Page tables come (almost) directly above:

(XEN)  Page tables:   ffffffff84001000->ffffffff84026000

I.e. they're just above the 64Mb boundary. Yet sadly in the host E820 map
there is

(XEN)  [0000000004000000, 0000000004009fff] (ACPI NVS)

i.e. a non-RAM range starting at 64Mb. The kernel (currently) won't tolerate
such an overlap (also if it was overlapping the kernel image, e.g. if on the
machine in question s sufficiently much larger kernel was used). Yet with its
fundamental goal of making its E820 match the host one I'm also in trouble
thinking of possible solutions / workarounds. I certainly do not see Xen
trying to cover for this, as the E820 map re-arrangement is purely a kernel
side decision (forward ported kernels got away without, and what e.g. the
BSDs do is entirely unknown to me).

In Qubes we have worked around the issue by moving the kernel lower
(CONFIG_PHYSICAL_START=0x200000):
https://github.com/QubesOS/qubes-linux-kernel/commit/3e8be4ac1682370977d4d0dc1d782c428d860282

Far from ideal, but gets it bootable...


could you test the attached kernel patches? They should fix the issue without
having to modify CONFIG_PHYSICAL_START.

I have tested them to boot up without problem on my test system, but I don't
have access to a system showing the E820 map conflict you are seeing.

The patches have been developed against kernel 6.11-rc2, but I think they
should apply to a 6.10 and maybe even an older kernel.

Sure, but tomorrow-ish.

Thanks.


If possible it would be nice to verify suspend to disk still working, as
the kernel will need to access the ACPI NVS area in this case.

That might be harder, as Qubes OS doesn't support suspend to disk, but
I'll see if something can be done.

Thinking about it - can this ever work with Xen?


Juergen




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.