
Re: CONFIG_XEN_VIRTIO{_FORCE_GRANT} interferes with nested virt



On 05.10.22 17:35, Marek Marczykowski-Górecki wrote:
On Wed, Oct 05, 2022 at 05:04:29PM +0200, Juergen Gross wrote:
On 05.10.22 15:51, Marek Marczykowski-Górecki wrote:
On Wed, Oct 05, 2022 at 03:34:56PM +0200, Juergen Gross wrote:
On 05.10.22 15:25, Marek Marczykowski-Górecki wrote:
On Wed, Oct 05, 2022 at 02:57:01PM +0200, Juergen Gross wrote:
On 05.10.22 14:41, Marek Marczykowski-Górecki wrote:
Hi,

When booting Xen with Linux dom0 nested under KVM,
CONFIG_XEN_VIRTIO_FORCE_GRANT=y makes it unable to use virtio devices
provided by L0 hypervisor (KVM with qemu). With PV dom0, grants are
required for virtio even if just CONFIG_XEN_VIRTIO is enabled.

This is probably an uncommon corner case, but one that has bitten me in my
CI setup... I think Xen should set a smarter
virtio_require_restricted_mem_acc(), one that enforces grants only for
devices really provided by another Xen VM (not by the "outer host"), but I'm
not sure how that could be done. Any ideas?


It should be possible to add a boot parameter for that purpose. Using it
would open a security hole, though (basically like all PCI passthrough to
PV guests).

What about excluding just dom0? At least currently, there is no way for
dom0 to see virtio devices provided by another Xen domU.

Even not via hotplug?

That's why I said "currently"; IIUC, hotplug of virtio devices under Xen
doesn't work yet, no?
With hotplug working, it would need proper detection of where the
backend lives, and probably some extra considerations regarding Xen on
Xen (based on the below, pv-shim could use grants).

As stated before, this isn't a problem specific to virtio devices. The same
applies to Xen PV devices.

Why is that an issue for Xen PV devices? They always use grants, so no
exception is needed. But more relevant here: there is no protocol for an L0
hypervisor (which would need to be Xen) to provide a PV device to a nested
L1 guest (besides the pv-shim case, which is already handled), so an L1
guest cannot confuse a PV device provided by L0 with one provided by L1.

That's the point. Today, using virtio the way you are using it is possible
only because virtio devices didn't have the security features of Xen PV
devices. With the addition of grant support for virtio devices, this has
changed.

BTW, you can have the old virtio behavior back by not enabling
CONFIG_XEN_VIRTIO.
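For reference, a kernel config fragment for the old behavior (both option names appear in this thread; CONFIG_XEN_VIRTIO_FORCE_GRANT depends on CONFIG_XEN_VIRTIO, so disabling the latter drops both):

```
# Old virtio behavior under Xen: no grant-based DMA for virtio
# CONFIG_XEN_VIRTIO is not set
```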


For me specifically, a command line option would work (because I don't
use Xen-based virtio devices when nested under KVM, or anywhere at all,
at least not yet), but I can see future cases where you have virtio
devices from both L0 and L1 in the same guest, and then it wouldn't be
that simple.

Let's think of a general solution covering all PV devices (Xen and virtio).

In fact, I wonder what the security benefit of
CONFIG_XEN_VIRTIO_FORCE_GRANT is. If the backend lives in dom0 (or a
stubdomain), it can access the whole guest memory anyway, whether the
frontend likes it or not. But if the backend is elsewhere (or the guest is
protected with AMD SEV-SNP, XSM, or similar), then the backend won't be able
to access memory outside of what the frontend shares explicitly. So, in the
non-dom0 case, a backend trying to provide a non-grant-based virtio device
will simply not function (because of its inability to access the guest's
memory), instead of gaining unintended access. Am I missing some implicit
memory sharing here?

You are missing the possibility of having a deprivileged user-land virtio
backend.

And BTW, SEV won't disable guest memory access; it will just make it
impossible to interpret memory contents from outside. A malicious
backend can still easily crash a SEV guest by clobbering its memory.


Juergen

