
Re: Why memory lending is needed for GPU acceleration


  • To: Teddy Astie <teddy.astie@xxxxxxxxxx>, Demi Marie Obenour <demiobenour@xxxxxxxxx>, Xen developer discussion <xen-devel@xxxxxxxxxxxxxxxxxxxx>, dri-devel@xxxxxxxxxxxxxxxxxxxxx, linux-mm@xxxxxxxxx, Ariadne Conill <ariadne@ariadne.space>
  • From: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 31 Mar 2026 08:23:14 -0300
  • Delivery-date: Tue, 31 Mar 2026 11:23:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 3/31/26 6:42 AM, Teddy Astie wrote:
On 3/30/26 10:13 PM, Val Packett wrote:
[..]

we have no need to replicate what KVM does. That's far from the only
thing that can be done with a dmabuf.

The import-export machinery, on the other hand, actually does pin the
buffers at the driver level: importers are not obligated to support
movable buffers (move_notify in dma_buf_attach_ops is entirely optional).
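To illustrate the distinction (a sketch of the kernel-internal API from include/linux/dma-buf.h — the importer names here are made up, and this obviously isn't standalone-runnable code):

```c
/* Sketch only: how an importer does (or doesn't) opt in to movable
 * buffers with the dma-buf attach API. */

/* A dynamic importer supplies move_notify via dma_buf_attach_ops: */
static void my_move_notify(struct dma_buf_attachment *attach)
{
        /* The exporter is about to move the backing storage: drop any
         * cached sg_table here and re-map lazily on next use. */
}

static const struct dma_buf_attach_ops my_importer_ops = {
        .allow_peer2peer = true,
        .move_notify     = my_move_notify,
};

/* attach = dma_buf_dynamic_attach(dmabuf, dev, &my_importer_ops, priv); */

/* A plain importer just calls dma_buf_attach(dmabuf, dev); with no
 * importer ops, the exporter has to keep the pages pinned while the
 * attachment is mapped. */
```

So an importer that never provides move_notify simply gets pinned buffers, which is the point above.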

dma-buf is by design non-movable while actively used (otherwise it would
break DMA). It's just a foreign buffer, and from the device's standpoint,
just plain RAM that needs to be mapped.

Interestingly, there is already XEN_GNTDEV_DMABUF…

Wait, do we even have any reason at all to suspect
that XEN_GNTDEV_DMABUF doesn't already satisfy all of our buffer-sharing
requirements?

XEN_GNTDEV_DMABUF was designed for GPU use cases, and more
precisely for paravirtualizing a display. The only issue I would have
with it is that grants don't scale for GPU 3D use cases (with
hundreds of MB to share).

At least for the Qubes side, we aren't aiming at running Crysis on a paravirtualized GPU just yet anyway :) First we just want desktop apps to run well.

Keep in mind that with virtgpu paravirtualization, actual buffer sharing between domains only happens for CPU access, which is mostly used for:

- initial resource uploads;
- the occasional readback (which is inherently slow and all graphics devs try not to *ever* do);
- special cases like screen capture.

Most CPU mappings of GPU-driver-managed buffers live only for the duration of a single memcpy. Mapping sizes can indeed get large for games, but for desktop applications they're rather small.

On the rendering hot path the guest virtgpu driver just submits jobs that refer to abstract handles managed by virglrenderer on the host, and buffer sharing is *not* happening.
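For the record, the guest-side CPU-access path through virtio-gpu looks roughly like this (a userspace sketch with error handling trimmed; the header path may differ between the kernel uapi and libdrm installs, and it of course needs an actual virtio-gpu DRM node to run):

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/virtgpu_drm.h>

/* Upload data into a virtgpu buffer object: the mapping typically lives
 * only for the duration of the one memcpy. */
static int upload(int drm_fd, uint32_t bo_handle,
                  const void *data, size_t len)
{
        struct drm_virtgpu_map req = { .handle = bo_handle };

        /* Ask the kernel for the fake mmap offset of this BO. */
        if (ioctl(drm_fd, DRM_IOCTL_VIRTGPU_MAP, &req))
                return -1;

        void *ptr = mmap(NULL, len, PROT_WRITE, MAP_SHARED,
                         drm_fd, req.offset);
        if (ptr == MAP_FAILED)
                return -1;

        memcpy(ptr, data, len);   /* the actual cross-domain CPU access */
        munmap(ptr, len);
        return 0;
}
```

That map/memcpy/unmap shape is exactly the short-lived mapping described above.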

But we can still keep the concept of a structured guest-owned memory
that is shared with Dom0 (but for larger quantities), I have some ideas
regarding improving that area in Xen.

The only issue with changing the memory sharing model is that you would
need to adjust the virtio-gpu aspect, but the rest can stay the same.

The biggest concern regarding driver compatibility is more about:
- can dma-buf be used for general buffers: probably yes (even with
OpenGL/Vulkan); the exception may be proprietary Nvidia drivers, which lack
the feature; very old hardware may struggle more with it

Current Nvidia blob drivers do not lack the feature, btw.

- can guest UMDs work without access to VRAM: yes (apparently); AMDGPU
has a special case where VRAM is not visible (e.g. a too-small PCI BAR);
there is "vram size" vs "vram visible size" (which can be 0); you could
fall back from guest-visible VRAM to RAM mapped on the device

UMDs work at a higher level: they operate on buffers that are managed by the KMD.

In any paravirtualization situation (whether "native contexts"/vDRM which runs the full HW-specific UMD in the guest, or API-forwarding solutions like Venus) the only guest KMD is virtio-gpu! The guest kernel isn't really aware of what VRAM even is.

https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/common/virtio/amdgpu_virtio_bo.c

^ this 300-ish-line file is everything amdgpu ever does with buffer objects on the virtio backend.

All it can do is manage host handles, import guest dmabufs into virtgpu to get handles for them, export handles to get guest dmabufs, and map handles for guest CPU access via the VIRTGPU_MAP ioctl. There are no special details to any of this, it's all very straightforward.

It seems to me that implementing VIRTGPU_MAP in terms of dmabuf grants would be easy!
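Roughly what I have in mind for the backend side, leaning on the gntdev dmabuf ioctls that already exist — a sketch only: the helper name is made up, flags are left at zero, and the exact direction of the grant/export handshake would need checking against drivers/xen/gntdev-dmabuf.c:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <xen/gntdev.h>

/* Hypothetical backend helper: wrap grant refs received from the guest
 * into a dmabuf fd that the host GPU stack can then import. */
static int dmabuf_from_grants(int gntdev_fd, uint32_t domid,
                              const uint32_t *refs, uint32_t count)
{
        struct ioctl_gntdev_dmabuf_exp_from_refs *req;
        int fd = -1;

        /* The struct carries a flexible refs[] array at the end. */
        req = calloc(1, sizeof(*req) + count * sizeof(refs[0]));
        if (!req)
                return -1;

        req->count = count;
        req->domid = domid;
        memcpy(req->refs, refs, count * sizeof(refs[0]));

        if (ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_EXP_FROM_REFS, req) == 0)
                fd = (int)req->fd;   /* hand to the host GPU driver */

        free(req);
        return fd;
}
```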

I'll need to get to that point first though, right now I'm still working on making basic virtio itself work in our (x86) situation.

- can it be defined in Vulkan terms (from the driver): you can have
device_local memory without it being host-visible (i.e. the memory exists,
but can't be added in the guest). You would probably just lose some
zero-copy paths with VRAM. Though you still have RAM shared with the GPU
(GTT in AMDGPU) if that matters.

What did you mean by "added" in the guest?

We shouldn't ever have to touch this level at all, anyhow…
Worth noting that if you're on integrated graphics, you don't have VRAM
and everything is RAM anyway.


Thanks,
~val
