Xen project Mailing List

Re: [Xen-users] Why is GPU passthrough so difficult?

To: "Kent R. Spillner" <kspillner@xxxxxxx>

From: Stephen Oberholtzer <stevie@xxxxxxxxx>

Date: Fri, 26 May 2017 20:51:07 -0400

Delivery-date: Sat, 27 May 2017 00:52:21 +0000

List-id: Xen user discussion <xen-users.lists.xen.org>

First, I disagree with your general implication that GPU passthrough is difficult. I am by no means a Xen expert, and it did not take me much effort to set up a Xen machine that runs a Windows HVM (originally 8.1, now 10) with GPU passthrough. I use it to play games. (A lot of games. Probably more than I ought to.)

I will grant, however, that there are a *lot* of little pieces that need to come together for it to work:

1. It's not going to work with an NVIDIA card. That's firmly in the "GPU vendors trying to upsell more enterprise hardware" camp; as I understand it, their drivers specifically detect virtualization and refuse to cooperate.

2. There is an issue with ATI/AMD cards: they do not support an important feature called FLR (Function Level Reset). The result is that you can't tell the card "act like you've just been rebooted", which means the video card does not present itself to the virtual machine's boot-up code properly. The upshot of this is that the VM must believe that the ATI/AMD card is a secondary graphics card, and it must boot off of the virtual Cirrus card that qemu provides. This isn't so bad; I just tell Windows not to use the virtual Cirrus card.

3. IOMMU fun! This is probably the most troublesome aspect, because there are a lot of pieces to get right, and several of them are under the complete control of non-experts. If it doesn't work, it's pretty much your motherboard manufacturer's fault.

Here's the deal: in HVM mode, page frames (aka 'physical memory locations') are virtualized. So the VM thinks it's controlling frames 0, 1, 2, 3, etc. but really those are mapped to, say, frames 47, 52, 93, and 107. This all works out, because all of the VM's memory accesses are made by the CPU, and the CPU is in on the conspiracy to fake out the VM's OS. But hardware devices (including video cards) can use DMA to directly transfer information between their onboard RAM and the RAM on the motherboard.
And when Windows (which is directly communicating with the GPU, so Xen doesn't get to intercept it) says to the video card "okay, the next screen to display is in frame 3", the graphics card copies frame 3, but the real data is in frame 107. Result: random garbage from some other VM (or maybe Xen itself) is used as display data for the screen. Or, even worse, the graphics card *writes* to frame 3, crashing that other VM (or, again, Xen itself). Either way, it's not gonna work.

To solve this problem, you need an IOMMU. An IOMMU brings DMA into the conspiracy. Xen can then program the IOMMU to say "when the GPU wants to read/write frame 3, secretly divert it to frame 107".

For Intel, the "VT-d" functionality available in the Core i7 means that the memory controller there has an IOMMU. I have no idea what the rules are for AMD chips.

This is the part where things start to break down, because a lot of mobo mfrs like to cheat in order to get better numbers (higher performance, more GPU slots, or both). The CPU has a limited number of PCIe lanes. If you want more lanes (for example, if you want to stick more than one x16 slot on a board) you'll need a PCIe-PCIe bridge. That's where things tend to go wrong: most of those bridges are *not* in on the IOMMU conspiracy. As a result, you can only pass through the *entire bridge* (i.e. all of the PCIe devices, or at least all of the GPUs) to a single VM.

My understanding from the research I did when I set this up several years ago:

* The "nForce" chipset bypasses the IOMMU somehow, making it completely useless for GPU passthrough.
* If you want to do passthrough on a mobo that has more than one x16 slot, look for a mobo that uses a PLX PEX 8747 chip for its PCIe-PCIe bridge. Note that these are going to be *expensive*, because those chips are something like $40. I'm not saying that "if it's expensive, then it will work". I'm saying that if you want one that works, it is going to be expensive.

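The frame mix-up described above (the guest's frame 3 really living in host frame 107) can be sketched as a toy model. This is plain Python, not Xen code: the names `GUEST_TO_HOST`, `make_host_memory`, and `dma_read` are made up for illustration, and real IOMMUs work on page tables in hardware rather than a dictionary. It only shows why a device that does DMA with guest frame numbers hits the wrong memory unless something translates the address first.

```python
# Toy model of device DMA with and without an IOMMU (illustrative only).

# Hypothetical guest-to-host mapping, using the numbers from the text:
# the guest sees frames 0..3, which really live in host frames 47, 52, 93, 107.
GUEST_TO_HOST = {0: 47, 1: 52, 2: 93, 3: 107}


def make_host_memory():
    """Build a fake host RAM: host frame number -> contents."""
    memory = {frame: "another VM's data" for frame in range(128)}
    for guest_frame, host_frame in GUEST_TO_HOST.items():
        memory[host_frame] = f"guest frame {guest_frame}"
    return memory


def dma_read(memory, frame, iommu_table=None):
    """Read one frame the way a device would.

    Without an IOMMU, the frame number the guest driver handed to the
    device is used directly as a host frame number. With an IOMMU, it is
    translated first, and frames with no mapping are refused.
    """
    if iommu_table is None:
        return memory[frame]
    if frame not in iommu_table:
        raise PermissionError(f"IOMMU fault: frame {frame} is not mapped")
    return memory[iommu_table[frame]]


if __name__ == "__main__":
    ram = make_host_memory()
    print(dma_read(ram, 3))                 # another VM's data
    print(dma_read(ram, 3, GUEST_TO_HOST))  # guest frame 3
```

The same table that redirects frame 3 to frame 107 also rejects frames the guest does not own, which is why a bridge that bypasses the translation step defeats the isolation for every device behind it.
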
My setup uses an ASRock H97M Pro4. From what I've been able to tell, ASRock motherboards have the best support for PCI passthrough.

4. BIOS fun! The BIOS has to correctly set up a whole lot of things that I don't understand in order for it all to come together.

Ironically, I had less trouble with the graphics card than I did with getting USB to pass through. I tried passing through 3 different PCIe USB cards and none of them would work; I still don't know why.

On Fri, May 26, 2017 at 3:54 PM, Kent R. Spillner <kspillner@xxxxxxx> wrote:
> Why is GPU passthrough so difficult? I saw a note on the wiki about GPU
> passthrough which mentioned the fact that modern video cards perform a lot of
> different functions and maintain a lot of state, but I couldn't find any
> details about why GPU passthrough is more difficult compared to other PCI
> devices.
>
> It seems that even with two separate GPUs it's still tricky to get things
> working. Are the main challenges in the drivers, or in the kernel, or in the
> way BIOS/UEFI initializes video cards, or some combination of all of these
> things? Or, dare I ask, is it the GPU vendors trying to upsell more
> "enterprise" hardware?
>
> I'm just curious to learn more about the underlying technical challenges
> around GPU passthrough in general. Thanks in advance for any explanation or
> pointers!
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> https://lists.xen.org/xen-users

--
-- Stevie-O
Real programmers use COPY CON PROGRAM.EXE
