
Re: [RFC PATCH] xen: privcmd: fix ioeventfd/ioreq crashing PV domain


  • To: Jürgen Groß <jgross@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Oleksandr Tyshchenko <oleksandr_tyshchenko@xxxxxxxx>
  • From: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 4 Nov 2025 22:16:44 -0300
  • Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx
  • Delivery-date: Wed, 05 Nov 2025 01:17:09 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 11/4/25 9:15 AM, Jürgen Groß wrote:
On 15.10.25 21:57, Val Packett wrote:
Starting a virtio backend in a PV domain would panic the kernel in
alloc_ioreq, which tries to dereference vma->vm_private_data as a
pages pointer when in reality it still holds PRIV_VMA_LOCKED.

Fix by allocating a pages array in mmap_resource in the PV case,
filling it with page info converted from the pfn array. This allows
ioreq to function successfully with a backend provided by a PV dom0.

Signed-off-by: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
---
I've been porting the xen-vhost-frontend[1] to Qubes, which runs on amd64
and where we (still) use PV for dom0. The x86 part didn't give me much
trouble, but the first thing I found was this crash, caused by using a PV
domain to host the backend: alloc_ioreq was dereferencing the '1' constant
and panicking the dom0 kernel.

I figured out that I can make a pages array in the expected format from the pfn array where the actual memory mapping happens for the PV case, and with
the fix, the ioreq part works: the vhost frontend replies to the probing
sequence and the guest recognizes which virtio device is being provided.
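
For context, a minimal sketch of what that pages array construction can
look like in privcmd's mmap_resource path (variable names like `kdata`,
`pfns` and `vma` are assumed from that context; this illustrates the
approach, not the exact patch):

    /*
     * Sketch: build a struct page array from the frame numbers that
     * the PV remap path produced, so alloc_ioreq() later finds a
     * usable pages pointer in vma->vm_private_data instead of
     * PRIV_VMA_LOCKED.
     */
    struct page **pages;
    unsigned int i;

    pages = kvcalloc(kdata.num, sizeof(*pages), GFP_KERNEL);
    if (!pages)
        return -ENOMEM;

    for (i = 0; i < kdata.num; i++)
        pages[i] = pfn_to_page(pfns[i]);

    vma->vm_private_data = pages;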

I still have another thing to debug: the MMIO accesses from the inner driver (e.g. virtio_rng) don't get through to the vhost provider (ioeventfd does
not get notified), and manually kicking the eventfd from the frontend
seems to crash... Xen itself?? (no Linux panic on console, just a freeze and
quick reboot - will try to set up a serial console now)

IMHO for making the MMIO accesses work you'd need to implement ioreq-server support for PV-domains in the hypervisor. This will be a major endeavor, so
before taking your Linux kernel patch I'd like to see this covered.

Sorry, I wasn't clear enough: it's *not* that the MMIO accesses don't work.

I debugged this a bit more, and it turns out:

1. The reason "ioeventfd does not get notified" is that accessing the
virtio page (allocated with this privcmd interface) from the kernel was
failing. The exchange between the guest driver and the userspace ioreq
server has been working perfectly, but the *kernel* access (which is what
needs this `struct page` allocation with the current code) was returning
nonsense, so the check for the virtqueue readiness flag was failing.

I have noticed and fixed (locally) a bug in this patch: reusing the `pfns`
allocation for `errs` in the `xen_remap_domain_mfn_array` call meant that
the actual pfn values were overwritten with zeros (the "success" error
code), and those were exactly the pfns I was using.
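
A sketch of what such a fix can look like (assuming a separate error
array and reusing the existing variable names from privcmd's
mmap_resource path; not a polished patch):

    /* Give the remap call its own error array so `pfns` survives. */
    int *errs;

    errs = kvcalloc(kdata.num, sizeof(*errs), GFP_KERNEL);
    if (!errs)
        return -ENOMEM;

    num = xen_remap_domain_mfn_array(vma, kdata.addr & PAGE_MASK,
                                     pfns, kdata.num, errs,
                                     vma->vm_page_prot, domid);
    /* `pfns` still holds the frame numbers; `errs` has per-page status. */
    /* (check and kvfree `errs` once the statuses have been consumed)   */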

Still, the memory visible in the dom0 kernel at that pfn is not the same
allocation that's mapped into the process. Instead, it's some random other
memory. I've added a hexdump for it in the ioeventfd notifier and it was
returning random stuff from other userspace programs, such as
"// SPDX-License-Identifier" from a text editor (haha). Actually, *once* it
did just work, and I managed to attach a virtio-rng driver and have it
fully work.
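
(A hexdump like that can be done with something along these lines; just a
sketch, with `ioreq_base` standing in for whatever kernel pointer the
notifier uses:)

    /* Sketch: dump the first 64 bytes the kernel sees at the ioreq mapping. */
    print_hex_dump(KERN_DEBUG, "privcmd ioreq: ", DUMP_PREFIX_OFFSET,
                   16, 1, ioreq_base, 64, true);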

Clearly I'm just struggling with the way memory mappings work under PV.
Do I need to specifically create a second mapping for the kernel using
the same `xen_remap_domain_mfn_array` call?

2. The reason "manually kicking the eventfd from the frontend seems to
crash... Xen itself" was that it triggered the guest interrupt, and I was
using ISA interrupts, which require the virtual (IO)APIC to exist; it
doesn't in PVH domains. For now I switched my test setup to HVM to get
around that, but I'd need to figure out a virq/pirq type setup to route
XEN_DMOP_set_isa_irq_level calls over event channels for PV(H) guests.

But I figured I'd post this as an RFC already, since the other bug may be
unrelated and the ioreq area itself does work now. I'd like to hear some
feedback on this from people who actually know Xen :)

My main problem with your patch is that it adds a memory allocation for a
very rare use case, impacting all current users of that functionality.

You could avoid that by using a different ioctl which could be selected by
specifying a new flag when calling xenforeignmemory_open() (have a look
into the Xen sources under tools/libs/foreignmemory/core.c).
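
(For illustration, from userspace that suggestion would look roughly like
the following; the flag name here is invented and does not exist today:)

    #include <xenforeignmemory.h>

    /* Hypothetical: opt in to a PV-aware resource mapping at open time. */
    xenforeignmemory_handle *fmem =
        xenforeignmemory_open(NULL, XENFOREIGNMEMORY_OPEN_PV_IOREQ /* hypothetical */);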

Right, that could be solved. Having userspace choose based on what kind of
domain it is sounds a bit painful (you're talking about C libraries and I'm
using independent Rust ones, so this logic would have to be present in
multiple places). But this kernel code could be refactored more.

We don't actually need any `struct page` specifically; `ioeventfd_interrupt`
only really needs a kernel pointer to the actual ioreq memory we're
allocating here.

I'm mostly just asking for help to figure out how to get that pointer.
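
For the record, when a valid struct page array *is* available, such a
pointer can be obtained with a single vmap(); a sketch of that shape
(not claiming this is what the current code does, and it still leaves
open what to map in the PV case):

    #include <linux/vmalloc.h>

    /*
     * Sketch: create one contiguous kernel mapping of the ioreq area
     * from a struct page array (torn down later with vunmap()).
     */
    static void *map_ioreq_area(struct page **pages, unsigned int nr_pages)
    {
        return vmap(pages, nr_pages, VM_MAP, PAGE_KERNEL);
    }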


Thanks,
~val
