Re: [Xen-devel] PV guest with PCI passthrough crash on Xen 4.8.3 inside KVM when booted through OVMF

On 27.11.23 16:56, Jason Andryuk wrote:
> On Mon, Nov 27, 2023 at 6:27 AM Marek Marczykowski-Górecki
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Nov 27, 2023 at 11:20:36AM +0000, Frediano Ziglio wrote:
>>> On Sun, Nov 26, 2023 at 2:51 PM Marek Marczykowski-Górecki
>>> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>>> On Mon, Feb 19, 2018 at 06:30:14PM +0100, Juergen Gross wrote:
>>>>> On 16/02/18 20:02, Andrew Cooper wrote:
>>>>>> On 16/02/18 18:51, Marek Marczykowski-Górecki wrote:
>>>>>>> On Fri, Feb 16, 2018 at 05:52:50PM +0000, Andrew Cooper wrote:
>>>>>>>> On 16/02/18 17:48, Marek Marczykowski-Górecki wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> As in the subject, the guest crashes on boot, before the kernel
>>>>>>>>> outputs anything. I've isolated this to the conditions below:
>>>>>>>>> - PV guest has a PCI device assigned (e1000e emulated by QEMU in
>>>>>>>>>   this case); without a PCI device it works
>>>>>>>>> - Xen (in KVM) is started through OVMF; with seabios it works
>>>>>>>>> - nested HVM is disabled in KVM
>>>>>>>>> - AMD IOMMU emulation is disabled in KVM; when enabled, qemu
>>>>>>>>>   crashes on boot (looks like a qemu bug, unrelated to this one)
>>>>>>>>>
>>>>>>>>> Version info:
>>>>>>>>> - KVM host: OpenSUSE 42.3, qemu 2.9.1,
>>>>>>>>>   ovmf-2017+git1492060560.b6d11d7c46-4.1, AMD
>>>>>>>>> - Xen host: Xen 4.8.3, dom0: Linux 4.14.13
>>>>>>>>> - Xen domU: Linux 4.14.13, direct boot
>>>>>>>>>
>>>>>>>>> Not sure if relevant, but initially I've tried booting xen.efi
>>>>>>>>> /mapbs /noexitboot and then the dom0 kernel crashed saying
>>>>>>>>> something about a conflict between e820 and kernel mapping. But
>>>>>>>>> now those options are disabled.
>>>>>>>>>
>>>>>>>>> The crash message:
>>>>>>>>> (XEN) d1v0 Unhandled invalid opcode fault/trap [#6, ec=0000]
>>>>>>>>> (XEN) domain_crash_sync called from entry.S: fault at ffff82d080218720 entry.o#create_bounce_frame+0x137/0x146
>>>>>>>>> (XEN) Domain 1 (vcpu#0) crashed on cpu#1:
>>>>>>>>> (XEN) ----[ Xen-4.8.3 x86_64 debug=n Not tainted ]----
>>>>>>>>> (XEN) CPU: 1
>>>>>>>>> (XEN) RIP: e033:[<ffffffff826d9156>]
>>>>>>>>
>>>>>>>> This is #UD, which is most probably hitting a BUG(). addr2line
>>>>>>>> this ^ to find some code to look at.
>>>>>>>
>>>>>>> addr2line failed me
>>>>>>
>>>>>> By default, vmlinux is stripped and compressed.
>>>>>> Ideally you want to addr2line the vmlinux artefact in the root of
>>>>>> your kernel build, which is the plain elf with debugging symbols.
>>>>>> Alternatively, use scripts/extract-vmlinux on the binary you
>>>>>> actually booted, which might get you somewhere.
>>>>>>
>>>>>>> , but System.map says it's xen_memory_setup. And it looks like the
>>>>>>> BUG() is the same as I had in dom0 before:
>>>>>>> "Xen hypervisor allocated kernel memory conflicts with E820 map".
>>>>>>
>>>>>> Juergen: Is there anything we can do to try and insert some dummy
>>>>>> exception handlers right at PV start, so we could at least print
>>>>>> out a oneliner to the host console which is a little more helpful
>>>>>> than Xen saying "something unknown went wrong"?
>>>>>
>>>>> You mean something like commit
>>>>> 42b3a4cb5609de757f5445fcad18945ba9239a07 added to kernel 4.15?
>>>>>
>>>>>>> Disabling e820_host in guest config solved the problem. Thanks!
>>>>>>> Is this some bug in Xen or OVMF, or is it expected behavior and
>>>>>>> e820_host should be avoided?
>>>>>>
>>>>>> I don't really know. e820_host is a gross hack which shouldn't
>>>>>> really be present. The actual problem is that Linux can't cope
>>>>>> with the memory layout it was given (and I can't recall if there
>>>>>> is anything Linux could potentially do to cope). OTOH, the
>>>>>> toolstack, which knew about e820_host and chose to lay the guest
>>>>>> out in an overlapping way, is probably also at fault.
>>>>>
>>>>> The kernel can cope with lots of E820 scenarios (e.g. by relocating
>>>>> initrd or the p2m map), but moving itself out of the way is not
>>>>> possible.
>>>>
>>>> I'm afraid I need to resurrect this thread...
>>>>
>>>> With recent kernels (6.6+), the e820_host=0 workaround is not an
>>>> option anymore. It makes Linux not initialize xen-swiotlb (due to
>>>> f9a38ea5172a3365f4594335ed5d63e15af2fd18), so PCI passthrough
>>>> doesn't work at all. While I can add yet another layer of workaround
>>>> (force xen-swiotlb with iommu=soft), that's getting unwieldy.
>>>>
>>>> Furthermore, I don't get the crash message anymore, even with a debug
>>>> hypervisor and guest_loglvl=all. Not even "Domain X crashed" in
>>>> `xl dmesg`.
>>>> It looks like the "crash" shutdown reason doesn't reach Xen, and
>>>> it's considered a clean shutdown (I can confirm it by changing
>>>> various `on_*` settings (via libvirt) and observing which gets
>>>> applied).
>>>>
>>>> Most tests I've done with 6.7-rc1, but the issue I observed on 6.6.1
>>>> already.
>>>>
>>>> This is on Xen 4.17.2. And the L0 is running Linux 6.6.1, and then
>>>> uses QEMU 8.1.2 + OVMF 202308 to run Xen as L1.
>>>
>>> So basically you start the domain and it looks like it's shutting
>>> down cleanly from logs.
>>> Can you see anything from the guest?
>>> Can you turn on some more debugging at guest level?
>>
>> No, it crashes before printing anything to the console, also with
>> earlyprintk=xen.
>>
>>> I tried to get some more information from the initial crash but I
>>> could not understand which guest code triggered the bug.
>>
>> I'm not sure which one it is this time (because I don't have Xen
>> reporting the guest crash...) but last time it was here:
>> https://github.com/torvalds/linux/blob/master/arch/x86/xen/setup.c#L873-L874
>
> Hi Marek,
>
> I too have run into this "Xen hypervisor allocated kernel memory
> conflicts with E820 map" error when running Xen under KVM & OVMF with
> SecureBoot. OVMF built without SecureBoot did not trip over the issue.
> It was a little while back - I have some notes though.
>
> Non-SecureBoot
> (XEN) [0000000000810000, 00000000008fffff] (ACPI NVS)
> (XEN) [0000000000900000, 000000007f8eefff] (usable)
>
> SecureBoot
> (XEN) [0000000000810000, 000000000170ffff] (ACPI NVS)
> (XEN) [0000000001710000, 000000007f0edfff] (usable)
>
> Linux (under Xen) is checking that _pa(_text) (= 0x1000000) is RAM,
> but it is not. Looking at the E820 map, there is a type 4, NVS, region
> defined:
> [0000000000810000, 000000000170ffff] (ACPI NVS)
>
> When OVMF is built with SMM (for SecureBoot) and S3Supported is true,
> the memory range 0x900000-0x170ffff is additionally marked ACPI NVS
> and Linux trips over this. It becomes usable RAM under Non-SecureBoot
> so Linux boots fine.
>
> What I don't understand is why there is even a check that _pa(_text)
> is RAM.
> Xen logs that it places dom0 way up high in memory, so the physical
> addresses of the kernel pages are much higher than 0x1000000. The
> value 0x1000000 for _pa(_text) doesn't match reality. Maybe there are
> some expectations for the ACPI NVS and other reserved regions to be
> 1-1 mapped? I tried removing the BUG mentioned above, but it still
> failed to boot. I think I also removed a second BUG, but unfortunately
> I don't have notes on either.

The _guest_ physical address is what matters here. When using the host
E820 map, the PV kernel tries to rearrange its guest physical memory
layout to match the E820 map. And a non-RAM GPA for the location where
the kernel is located triggers the BUG.


Juergen

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Attachment: OpenPGP_signature.asc
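
For readers of the archive, the E820 observation in this thread can be checked mechanically. The following is a minimal Python sketch, not kernel code: it parses the two E820 excerpts quoted by Jason and reports which region type contains _pa(_text) = 0x1000000. The helper names `parse_e820` and `region_type` are hypothetical and only serve this illustration; the actual check in arch/x86/xen/setup.c is not reproduced here.

```python
import re

# E820 excerpts exactly as quoted in the thread.
NON_SECUREBOOT = """\
(XEN) [0000000000810000, 00000000008fffff] (ACPI NVS)
(XEN) [0000000000900000, 000000007f8eefff] (usable)
"""
SECUREBOOT = """\
(XEN) [0000000000810000, 000000000170ffff] (ACPI NVS)
(XEN) [0000000001710000, 000000007f0edfff] (usable)
"""

# Guest physical address of the kernel text, as stated in the thread.
KERNEL_TEXT_PA = 0x1000000

# Matches "[start, end] (type)" with hexadecimal bounds.
ENTRY_RE = re.compile(r"\[([0-9a-f]+), ([0-9a-f]+)\] \(([^)]+)\)")


def parse_e820(text):
    """Return a list of (start, end_inclusive, type) tuples (hypothetical helper)."""
    return [(int(s, 16), int(e, 16), t) for s, e, t in ENTRY_RE.findall(text)]


def region_type(e820, addr):
    """Return the type of the region containing addr, or None if unmapped."""
    for start, end, kind in e820:
        if start <= addr <= end:
            return kind
    return None


for name, text in (("Non-SecureBoot", NON_SECUREBOOT), ("SecureBoot", SECUREBOOT)):
    kind = region_type(parse_e820(text), KERNEL_TEXT_PA)
    print(f"{name}: {KERNEL_TEXT_PA:#x} lies in a region of type {kind!r}")
```

With the SecureBoot map, 0x1000000 falls inside the ACPI NVS range, which is the non-RAM guest physical address that Juergen describes as triggering the BUG; with the Non-SecureBoot map the same address lies in usable RAM.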