
Re: Why memory lending is needed for GPU acceleration


  • To: Teddy Astie <teddy.astie@xxxxxxxxxx>, Demi Marie Obenour <demiobenour@xxxxxxxxx>, Xen developer discussion <xen-devel@xxxxxxxxxxxxxxxxxxxx>, dri-devel@xxxxxxxxxxxxxxxxxxxxx, linux-mm@xxxxxxxxx, Ariadne Conill <ariadne@ariadne.space>
  • From: Val Packett <val@xxxxxxxxxxxxxxxxxxxxxx>
  • Date: Tue, 31 Mar 2026 08:23:14 -0300
  • Delivery-date: Tue, 31 Mar 2026 11:23:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 3/31/26 6:42 AM, Teddy Astie wrote:
On 3/30/26 10:13 PM, Val Packett wrote:
[..]

we have no need to replicate what KVM does. That's far from the only
thing that can be done with a dmabuf.

The import-export machinery, on the other hand, actually does pin the
buffers at the driver level: importers are not obligated to support
movable buffers (move_notify in dma_buf_attach_ops is entirely optional).
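To illustrate the distinction (a sketch of the kernel-internal API from include/linux/dma-buf.h — the importer names here are made up, and this obviously isn't standalone-runnable code):

```c
/* Sketch only: how an importer does (or doesn't) opt in to movable
 * buffers with the dma-buf attach API. */

/* A dynamic importer supplies move_notify via dma_buf_attach_ops: */
static void my_move_notify(struct dma_buf_attachment *attach)
{
        /* The exporter is about to move the backing storage: drop any
         * cached sg_table here and re-map lazily on next use. */
}

static const struct dma_buf_attach_ops my_importer_ops = {
        .allow_peer2peer = true,
        .move_notify     = my_move_notify,
};

/* attach = dma_buf_dynamic_attach(dmabuf, dev, &my_importer_ops, priv); */

/* A plain importer just calls dma_buf_attach(dmabuf, dev); with no
 * importer ops, the exporter has to keep the pages pinned while the
 * attachment is mapped. */
```

So an importer that never provides move_notify simply gets pinned buffers, which is the point above.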

dma-buf is by design non-movable while actively used (otherwise it would
break DMA). It's just a foreign buffer, and from the device's standpoint,
just plain RAM that needs to be mapped.

Interestingly, there is already XEN_GNTDEV_DMABUF…

Wait, do we even have any reason at all to suspect
that XEN_GNTDEV_DMABUF doesn't already satisfy all of our buffer-sharing
requirements?

XEN_GNTDEV_DMABUF was designed for GPU use cases, and more
precisely for paravirtualizing a display. The only issue I would have
with it is that grants don't scale for GPU 3D use cases (with
hundreds of MB to share).

At least for the Qubes side, we aren't aiming at running Crysis on a paravirtualized GPU just yet anyway :) First we just want desktop apps to run well.

Keep in mind that with virtgpu paravirtualization, actual buffer sharing between domains only happens for CPU access, which is mostly used for:

- initial resource uploads;
- the occasional readback (which is inherently slow and all graphics devs try not to *ever* do);
- special cases like screen capture.

Most CPU mappings of GPU-driver-managed buffers live only for the duration of a single memcpy. Mapping sizes can indeed get large for games, but for desktop applications they're rather small.

On the rendering hot path the guest virtgpu driver just submits jobs that refer to abstract handles managed by virglrenderer on the host, and buffer sharing is *not* happening.
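For the record, the guest-side CPU-access path through virtio-gpu looks roughly like this (a userspace sketch with error handling trimmed; the header path may differ between the kernel uapi and libdrm installs, and it of course needs an actual virtio-gpu DRM node to run):

```c
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <drm/virtgpu_drm.h>

/* Upload data into a virtgpu buffer object: the mapping typically lives
 * only for the duration of the one memcpy. */
static int upload(int drm_fd, uint32_t bo_handle,
                  const void *data, size_t len)
{
        struct drm_virtgpu_map req = { .handle = bo_handle };

        /* Ask the kernel for the fake mmap offset of this BO. */
        if (ioctl(drm_fd, DRM_IOCTL_VIRTGPU_MAP, &req))
                return -1;

        void *ptr = mmap(NULL, len, PROT_WRITE, MAP_SHARED,
                         drm_fd, req.offset);
        if (ptr == MAP_FAILED)
                return -1;

        memcpy(ptr, data, len);   /* the actual cross-domain CPU access */
        munmap(ptr, len);
        return 0;
}
```

That map/memcpy/unmap shape is exactly the short-lived mapping described above.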

But we can still keep the concept of a structured guest-owned memory
that is shared with Dom0 (but for larger quantities), I have some ideas
regarding improving that area in Xen.

The only issue with changing the memory sharing model is that you would
need to adjust the virtio-gpu aspect, but the rest can stay the same.

The biggest concern regarding driver compatibility is more about:
- can dma-buf be used for general buffers: probably yes (even with
OpenGL/Vulkan); the exception may be proprietary Nvidia drivers, which lack
the feature; very old hardware may struggle more with it

Current Nvidia blob drivers do not lack the feature, btw.

- can guest UMDs work without access to VRAM: yes (apparently); AMDGPU
has a special case where VRAM is not visible (e.g. a too-small PCI BAR);
there is "vram size" vs "vram visible size" (which can be 0); you could
fall back from guest-visible VRAM to RAM mapped on the device

UMDs work at a higher level: they operate on buffers that are managed by the KMD.

In any paravirtualization situation (whether "native contexts"/vDRM which runs the full HW-specific UMD in the guest, or API-forwarding solutions like Venus) the only guest KMD is virtio-gpu! The guest kernel isn't really aware of what VRAM even is.

https://gitlab.freedesktop.org/mesa/mesa/-/blob/main/src/amd/common/virtio/amdgpu_virtio_bo.c

^ this 300-ish-line file is everything amdgpu ever does with buffer objects on the virtio backend.

All it can do is manage host handles, import guest dmabufs into virtgpu to get handles for them, export handles to get guest dmabufs, and map handles for guest CPU access via the VIRTGPU_MAP ioctl. There are no special details to any of this, it's all very straightforward.

It seems to me that implementing VIRTGPU_MAP in terms of dmabuf grants would be easy!
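Roughly what I have in mind for the backend side, leaning on the gntdev dmabuf ioctls that already exist — a sketch only: the helper name is made up, flags are left at zero, and the exact direction of the grant/export handshake would need checking against drivers/xen/gntdev-dmabuf.c:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <xen/gntdev.h>

/* Hypothetical backend helper: wrap grant refs received from the guest
 * into a dmabuf fd that the host GPU stack can then import. */
static int dmabuf_from_grants(int gntdev_fd, uint32_t domid,
                              const uint32_t *refs, uint32_t count)
{
        struct ioctl_gntdev_dmabuf_exp_from_refs *req;
        int fd = -1;

        /* The struct carries a flexible refs[] array at the end. */
        req = calloc(1, sizeof(*req) + count * sizeof(refs[0]));
        if (!req)
                return -1;

        req->count = count;
        req->domid = domid;
        memcpy(req->refs, refs, count * sizeof(refs[0]));

        if (ioctl(gntdev_fd, IOCTL_GNTDEV_DMABUF_EXP_FROM_REFS, req) == 0)
                fd = (int)req->fd;   /* hand to the host GPU driver */

        free(req);
        return fd;
}
```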

I'll need to get to that point first though, right now I'm still working on making basic virtio itself work in our (x86) situation.

- can it be defined in Vulkan terms (from the driver): you can have
device_local memory without it being host-visible (i.e. the memory exists,
but can't be added in the guest). You would probably just lose some
zero-copy paths with VRAM. Though you still have RAM shared with the GPU
(GTT in AMDGPU) if that matters.

What did you mean by "added" in the guest?

We shouldn't ever have to touch this level at all, anyhow…
Worth noting that if you're on integrated graphics, you don't have VRAM
and everything is RAM anyway.


Thanks,
~val
