
Re: [Xen-users] Why is GPU passthrough so difficult?

First, I disagree with your general implication that GPU passthrough
is difficult.  I am by no means a Xen expert, and it did not take me
much effort to set up a Xen machine that runs a Windows HVM
(originally 8.1, now 10) with GPU passthrough.  I use it to play
games. (A lot of games. Probably more than I ought to.)

I will grant, however, that there are a *lot* of little pieces that
need to come together for it to work:

1. It's not going to work with an nVidia card. That's firmly in the
"GPU vendors trying to upsell more enterprise hardware" camp; as I
understand it, their drivers specifically detect virtualization and
refuse to cooperate.

2. There is an issue with ATI/AMD cards: they do not support an
important feature called FLR (Function Level Reset).  The result is
that you can't tell the card "act like you've just been rebooted",
which means the video card does not present itself properly to the
virtual machine's boot-up code.  The upshot is: the VM must believe
that the ATI/AMD card is a secondary graphics card, and it must boot
off of the virtual Cirrus card that qemu provides. This isn't so bad;
I just tell Windows not to use the virtual Cirrus card.
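
For reference, the relevant bits of my domain config look roughly like
this (a sketch, not my literal file -- the PCI address is a made-up
example; find your card's with lspci):

```
# Hypothetical xl.cfg fragment.  '01:00.0'/'01:00.1' are example
# BDF addresses for the GPU and its HDMI audio function.
builder      = "hvm"
pci          = [ '01:00.0', '01:00.1' ]
gfx_passthru = 0    # GPU is *secondary*; the VM boots off emulated VGA
stdvga       = 0    # keep the default emulated Cirrus card as primary
```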

3. IOMMU fun!  This is probably the most troublesome aspect, because
there are a lot of pieces to get right, and several of them are
completely out of your hands. If it doesn't work, it's pretty much
your motherboard manufacturer's fault.

Here's the deal: in HVM mode, page frames (aka 'physical memory
locations') are virtualized.  So the VM thinks it's controlling frames
0, 1, 2, 3, etc. but really those are mapped to, say, frames 47, 52,
93, and 107.  This all works out, because all of the VM's memory
accesses are made by the CPU, and the CPU is in on the conspiracy to
fake out the VM's OS.

But hardware devices (including video cards) can use DMA to transfer
information directly between their onboard RAM and the RAM on the
motherboard.  So when Windows (which is talking directly to the GPU,
so Xen doesn't get to intercept it) tells the video card "okay, the
next screen to display is in frame 3", the graphics card copies the
real machine frame 3 -- but the VM's data actually lives in frame 107.
Result: random garbage from some other VM (or maybe Xen itself) is
used as display data for the screen. Or, even worse, the graphics card
*writes* to frame 3, crashing that other VM (or, again, Xen itself).
Either way, it's not gonna work.

To solve this problem, you need an IOMMU.  An IOMMU brings the DMA
controller into the conspiracy.  Xen can then program the IOMMU to say
"when the GPU wants to read/write frame 3, secretly divert it to frame
107".
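
A toy sketch of the situation, with made-up frame numbers and contents
(purely illustrative -- this is not how Xen's actual data structures
look):

```python
# Toy model: guest "physical" frames vs. real machine frames.
# All frame numbers and contents are invented for illustration.

# Second-stage translation the hypervisor keeps for this VM:
guest_to_machine = {0: 47, 1: 52, 2: 93, 3: 107}

machine_ram = {47: "vm kernel", 52: "vm heap", 93: "vm stack",
               107: "the actual framebuffer",
               3: "some other VM's secrets"}

def cpu_read(guest_frame):
    # CPU accesses go through the second-stage tables, so the VM
    # transparently gets the right machine frame.
    return machine_ram[guest_to_machine[guest_frame]]

def dma_read_no_iommu(frame):
    # Without an IOMMU, a device's DMA hits machine frames directly:
    # the guest's "frame 3" is taken literally.
    return machine_ram[frame]

def dma_read_with_iommu(frame):
    # With an IOMMU, Xen programs the same translation for the device.
    return machine_ram[guest_to_machine[frame]]

print(cpu_read(3))             # the actual framebuffer
print(dma_read_no_iommu(3))    # some other VM's secrets -- the bug
print(dma_read_with_iommu(3))  # the actual framebuffer -- the fix
```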

On Intel, the "VT-d" feature available in the Core i7 means that the
memory controller there has an IOMMU. I have no idea what the rules
are for AMD chips.

This is the part where things start to break down, because a lot of
mobo manufacturers like to cheat in order to get better numbers
(higher performance, more GPU slots, or both).  The CPU has a limited
number of PCIe lanes; if you want more (for example, if you want to
stick more than one x16 slot on a board) you'll need a PCIe-to-PCIe
bridge.

That's where things tend to go wrong: most of those bridges are *not*
in on the IOMMU conspiracy. As a result, you can only pass through the
*entire bridge* (i.e. all of the PCIe devices behind it, or at least
all of the GPUs) to a single VM.
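
In effect, everything behind a bridge that can't translate per-device
gets lumped together for passthrough purposes -- something like this
(a sketch with invented device names; not real topology-probing code):

```python
# Sketch: passthrough granularity when a bridge can't participate
# in per-device IOMMU translation.  Topology is invented.

# Which bridge each device sits behind:
topology = {
    "gpu0": "bridge_a",
    "gpu1": "bridge_a",  # same non-translating bridge as gpu0
    "nic0": "root",      # directly on the root complex
}

# Bridges that are "in on the conspiracy" (translate per-device):
translating = {"root"}

def passthrough_unit(device):
    """Smallest set of devices that must all go to the same VM."""
    bridge = topology[device]
    if bridge in translating:
        return {device}  # can be assigned on its own
    # Otherwise, everything behind that bridge moves as one unit.
    return {d for d, b in topology.items() if b == bridge}

print(passthrough_unit("nic0"))  # the NIC can go alone
print(passthrough_unit("gpu0"))  # gpu0 drags gpu1 along with it
```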

My understanding from the research I did when I set this up several years ago:

* The "nForce" chipset bypasses the IOMMU somehow, making it
completely useless for GPU passthrough.
* If you want to do passthrough on a mobo that has more than one x16
slot, look for a mobo that uses a PLX PEX 8747 chip for its PCIe-PCIe
bridge.  Note that these are going to be *expensive*, because those
chips are something like $40.  I'm not saying that "if it's expensive,
then it will work". I'm saying that if you want one that works, it is
going to be expensive.

My setup uses an ASRock H97M Pro4.  From what I've been able to tell,
ASRock motherboards have the best support for PCI passthrough.

4. BIOS fun! The BIOS has to correctly set up a whole lot of things
that I don't understand in order for it all to come together.

Ironically, I had less trouble with the graphics card than I did with
getting USB to pass through.  I tried passing through 3 different PCIe
USB cards and none of them would work; I still don't know why.

On Fri, May 26, 2017 at 3:54 PM, Kent R. Spillner <kspillner@xxxxxxx> wrote:
> Why is GPU passthrough so difficult?  I saw a note on the wiki about GPU 
> passthrough which mentioned the fact that modern video cards perform a lot of 
> different functions and maintain a lot of state, but I couldn't find any 
> details about why GPU passthrough is more difficult compared to other PCI 
> devices.
> It seems that even with two separate GPUs it's still tricky to get things 
> working.  Are the main challenges in the drivers, or in the kernel, or in the 
> way BIOS/UEFI initializes video cards, or some combination of all of these 
> things?  Or, dare I ask, is it the GPU vendors trying to upsell more 
> "enterprise" hardware?
> I'm just curious to learn more about the underlying technical challenges 
> around GPU passthrough in general.  Thanks in advance for any explanation or 
> pointers!
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> https://lists.xen.org/xen-users

-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE
