
Re: CONFIG_XEN_VIRTIO{_FORCE_GRANT} interferes with nested virt



On 05.10.22 15:51, Marek Marczykowski-Górecki wrote:
On Wed, Oct 05, 2022 at 03:34:56PM +0200, Juergen Gross wrote:
On 05.10.22 15:25, Marek Marczykowski-Górecki wrote:
On Wed, Oct 05, 2022 at 02:57:01PM +0200, Juergen Gross wrote:
On 05.10.22 14:41, Marek Marczykowski-Górecki wrote:
Hi,

When booting Xen with Linux dom0 nested under KVM,
CONFIG_XEN_VIRTIO_FORCE_GRANT=y makes it impossible to use virtio devices
provided by the L0 hypervisor (KVM with QEMU). With a PV dom0, grants are
required for virtio even if just CONFIG_XEN_VIRTIO is enabled.

This is probably an uncommon corner case, but one that has bitten me in my
CI setup... I think Xen should set a smarter
virtio_require_restricted_mem_acc() callback, one that enforces grants only
for devices really provided by another Xen VM (not by the "outer host"),
but I'm not sure how that could be done. Any ideas?
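
A minimal sketch of what such a selective check could look like, assuming the
virtio_set_mem_acc_cb() hook used by the patch further down in this mail;
backend_is_xen_domain() and xen_virtio_selective_mem_acc() are made-up names,
and the detection behind the placeholder is exactly the missing piece:

/*
 * Sketch only: require grants just for virtio devices whose backend is
 * another Xen domain.  backend_is_xen_domain() is a placeholder, not an
 * existing helper.
 */
#include <linux/virtio.h>
#include <linux/virtio_anchor.h>

static bool backend_is_xen_domain(struct virtio_device *dev)
{
	return false;	/* placeholder: real backend detection would go here */
}

static bool xen_virtio_selective_mem_acc(struct virtio_device *dev)
{
	/* Devices backed by another Xen domain must use grants ... */
	if (backend_is_xen_domain(dev))
		return true;

	/* ... devices from the outer (L0) host are left unrestricted. */
	return false;
}

/* Installed instead of virtio_require_restricted_mem_acc():
 *	virtio_set_mem_acc_cb(xen_virtio_selective_mem_acc);
 */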


It should be possible to add a boot parameter for that purpose. Using it
would open a security hole, though (basically like all PCI passthrough to
PV guests).
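
A sketch of such an opt-out parameter, just to illustrate the idea (the
parameter name "xen_virtio_grants" and everything around it is invented,
and the security caveat above still applies):

/* Sketch: invented "xen_virtio_grants=off" command line opt-out. */
#include <linux/init.h>
#include <linux/string.h>

static bool xen_virtio_grants __initdata = true;

static int __init parse_xen_virtio_grants(char *arg)
{
	if (arg && !strcmp(arg, "off"))
		xen_virtio_grants = false;
	return 0;
}
early_param("xen_virtio_grants", parse_xen_virtio_grants);

/* xen_pv_init_platform() would then skip forcing grants when opted out:
 *	if (IS_ENABLED(CONFIG_XEN_VIRTIO) && xen_virtio_grants)
 *		virtio_set_mem_acc_cb(virtio_require_restricted_mem_acc);
 */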

What about excluding just dom0? At least currently, there is no way for
dom0 to see virtio devices provided by another Xen domU.

Not even via hotplug?

That's why I said "currently"; IIUC, hotplug of virtio devices under Xen
doesn't work yet, no?
With hotplug working, it would need proper detection of where the
backend lives, and probably some extra considerations re Xen on
Xen (based on what's below, pv-shim could use grants).

As stated before, this isn't a problem specific to virtio devices. The same
applies to Xen PV devices.


For me specifically, a command line option would work (because I don't
use Xen-based virtio devices when nested under KVM, or anywhere at all,
at least not yet), but I can see future cases where you have virtio
devices from both L0 and L1 in the same guest, and then it wouldn't be
that simple.

Let's think of a general solution covering all PV devices (Xen and virtio).


Something like this:
---8<---
diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
index 9b1a58dda935..6ac32b0b3720 100644
--- a/arch/x86/xen/enlighten_pv.c
+++ b/arch/x86/xen/enlighten_pv.c
@@ -111,7 +111,7 @@ static DEFINE_PER_CPU(struct tls_descs, shadow_tls_desc);
 static void __init xen_pv_init_platform(void)
 {
 	/* PV guests can't operate virtio devices without grants. */
-	if (IS_ENABLED(CONFIG_XEN_VIRTIO))
+	if (IS_ENABLED(CONFIG_XEN_VIRTIO) && !xen_initial_domain())
 		virtio_set_mem_acc_cb(virtio_require_restricted_mem_acc);
 
 	populate_extra_pte(fix_to_virt(FIX_PARAVIRT_BOOTMAP));
---8<---
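
The hunk above only covers the PV path; a PVH/HVM dom0 with
CONFIG_XEN_VIRTIO_FORCE_GRANT=y would presumably need the same treatment,
assuming the callback is installed in the corresponding HVM init path
(untested sketch, not part of the patch):

/* Untested sketch for the HVM/PVH side, analogous to the PV hunk above. */
	if (IS_ENABLED(CONFIG_XEN_VIRTIO_FORCE_GRANT) && !xen_initial_domain())
		virtio_set_mem_acc_cb(virtio_require_restricted_mem_acc);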

This BTW also raises the question of what will happen with Xen nested inside
Xen, when L0 provides virtio devices to L1. Grants set up by the L1 dom0
wouldn't work on L0, no? Or maybe this is solved already for the pv-shim
case?

This is a similar problem to the one with normal Xen PV devices.

You will need either a simple grant passthrough like with pv-shim (enabling
such devices for one guest in L1 only), or a grant multiplexer in L1 Xen in
case you want multiple guests in L1 to be able to use L0 PV devices.

This will be tricky, at least with the current frontend drivers.
The frontend kernel is in charge of assigning grant refs, _and_ of
communicating them to the backend. Such a multiplexer would need to
intercept one or the other (either translate the refs, or ensure they are
allocated from distinct ranges to begin with). I don't see how that
could be done with the current domU kernels. It might work better with
your idea of multiple grant v3 trees, where the hypervisor could dictate
grant ranges.
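
Purely as an illustration of the "translate" variant (all names below are
hypothetical, this is not Xen code): the multiplexer in L1 Xen would
essentially keep a per-L1-domain table mapping the grant refs handed out by
a frontend to grant refs actually valid in L0, and rewrite them before
forwarding requests:

/* Hypothetical sketch of per-domain grant ref translation in an L1 multiplexer. */
#include <stdint.h>

typedef uint32_t grant_ref_t;

struct gref_map_entry {
	grant_ref_t l1_ref;	/* ref as allocated by the L1 guest's frontend */
	grant_ref_t l0_ref;	/* ref actually granted to the L0 backend */
};

struct gref_map {
	struct gref_map_entry *entries;
	unsigned int nr;
};

/* Rewrite an L1 ref to the matching L0 ref before it reaches the L0 backend. */
static int translate_gref(const struct gref_map *map,
			  grant_ref_t l1_ref, grant_ref_t *l0_ref)
{
	unsigned int i;

	for (i = 0; i < map->nr; i++) {
		if (map->entries[i].l1_ref == l1_ref) {
			*l0_ref = map->entries[i].l0_ref;
			return 0;
		}
	}
	return -1;	/* frontend used a ref the multiplexer never saw */
}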

Yes, this is another advantage of the V3 approach I hadn't thought of
before.


Juergen
