Re: [Xen-users] Getting ready for a computer build for IOMMU virtualization, need some input regarding sharing multiple GPUs among VMs

  • To: Gordan Bobic <gordan@xxxxxxxxxx>, Zir Blazer <zir_blazer@xxxxxxxxxxx>
  • From: "H. Sieger" <powerhouse.linux@xxxxxxxxx>
  • Date: Wed, 18 Sep 2013 15:47:00 -0700 (PDT)
  • Cc: "xen-users@xxxxxxxxxxxxx" <xen-users@xxxxxxxxxxxxx>
  • Delivery-date: Wed, 18 Sep 2013 22:50:50 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>

Gordan: I was quite surprised to see that you recommend Asus motherboards over others. The Sabertooth in particular made me laugh a bit. Well, I like Asus boards as long as they don't have to run Linux / Xen. Recently some people reported BIOS issues with Asus AMD boards - see here: http://xen.1045712.n5.nabble.com/Xen-IOMMU-disabled-due-to-IVRS-table-Blah-blah-blah-td5716461.html. In my case, with an Asus Sabertooth X79, a BIOS update reportedly broke VT-d/IOMMU (luckily I didn't upgrade). When I wrote to Asus support at HQ to inquire about this issue, they first denied it and then told me to use Windows if I wanted any support. To me this translates into "buy elsewhere", since I use Xen / Linux. Essentially I'm now stuck with an Asus board on which I can't run a BIOS upgrade (upgrading my board's BIOS is irreversible).

So I guess experience varies. Since you said you use Asus boards, which boards and which BIOS release? Also, do they all run Linux / Xen (or similar with IOMMU)?

I don't know where Supermicro stands, but chances are that they support Linux, since their boards are geared towards servers.

From: Gordan Bobic <gordan@xxxxxxxxxx>
To: Zir Blazer <zir_blazer@xxxxxxxxxxx>
Cc: xen-users@xxxxxxxxxxxxx
Sent: Wednesday, September 18, 2013 3:56 PM
Subject: Re: [Xen-users] Getting ready for a computer build for IOMMU virtualization, need some input regarding sharing multiple GPUs among VMs

On Tue, 17 Sep 2013 05:34:08 -0300, Zir Blazer <zir_blazer@xxxxxxxxxxx>
> I already sent this, here:
> http://lists.xen.org/archives/html/xen-users/2013-08/msg00228.html

[huge snip]

I suspect the reason you never got a reply is to do with the
lack of conciseness of your post - many people probably gave
up on reading before you actually got to the point.

> My last mails here has been regarding Hardware compatibility with the
> IOMMU virtualization feature, but I think I have that already nailed
> down. So, my next computer build specs will be:
> Processor: Intel Xeon E3-1245 V3 Haswell
> Alternatively a cheaper 1225 V3, 200 MHz slower and no Hyper
> Threading. I'm planning to use the integrated GPU. This Processor its
> nearly the same than the Core i7 4770 except that the Turbo Frequency
> is 100 MHz slower, but its slightly cheaper, got ECC Memory support,
> the integrated GPU is supposed to be able to use professional CAD
> certified Drivers (A la Firepro or Quadro), AND best of all, a name
> that stands out of the Desktop crowd.

Intel integrated GPUs are not that great. Unreal Tournament 2K4
runs fine on my Chromebook, but that isn't exactly a particularly
demanding game. If applications like the ones covered by the
SPECviewperf benchmark are your primary concern, and you are
on a budget, look into getting something like a Quadro 2000.
If your main goal is gaming, get a GTX480 and BIOS-mod it
into a Quadro 6000 - you won't get the Quadro performance in
SPECviewperf, but you will get perfectly working VGA passthrough.

Look here:

GTS450 -> Q2000

GTX470 -> Q5000

If you want something more recent than that, a GTX680 can be
modified into a Quadro K5000 or half of a Grid K2, but this
requires a bit of soldering.

The upshot of going for a cheap Quadro or a Quadrified
Nvidia card is that rebooting VMs doesn't cause the problems
that ATI cards are widely reported to suffer from.

> Additionally, I may want to undervolt it, check this Thread:
> http://forums.anandtech.com/showthread.php?t=2330764

You should be aware that Intel have changed VID control on Haswell
and later CPUs, so undervolt tuning based on clock multiplier
(e.g. using something like RMClock on Windows or PHC on Linux)
no longer works. If you want to use this functionality, you would
be better off picking a pre-Haswell CPU. I have this problem with
my Chromebook Pixel, which runs uncomfortably hot (if you keep it
on your lap), even when mostly idle.

> Motherboard: Supermicro X10SAT (C226 Chipset)
> For my tastes, its a very expensive Motherboard. I know AsRock has
> been praised for their good VT-d/AMD-Vi support even on the cheap
> Desktop Motherboards, but I'm extremely attracted to the idea of
> building a computer with proper, quality Workstation-class parts. As
> soon as I find it in stock and at a good price in a vendor ships
> internationally, I'm ordering it along with the Processor. For more
> info regarding my Motherboard-finding quest, check this Thread:
> http://forums.anandtech.com/showthread.php?t=2326402

I have more or less given up on buying any non-Asus motherboards.
Switching to an EVGA SR-2 after completely trouble-free 5 years
with my Asus Maximus Extreme has really shown me just how good
Asus are compared to other manufacturers.

All things being equal, if I were building a rig for similar
purposes to yours (two gaming VMs with VGA passthrough, one for me,
one for the wife), I would probably get something like an Asus
Sabertooth or Crosshair motherboard with an 8-Core AMD CPU. They
are reasonably priced, support ECC, and seem to have worked
quite well for VGA passthrough for many people on the list.

Avoid anything featuring Nvidia NF200 PCIe bridges at all cost.
That way lies pain and suffering. I'm in the process of working
on two patches for Xen just to make things workable on my
EVGA SR-2 (which has ALL of its PCIe slots behind NF200 bridges).

> Memory Modules: 32 GB / 4 * 8 GB AMD Performance Edition RP1866
> Already purchased these, thinking of making a RAMDisk. When I
> purchased them I didn't think that I would be able to use ECC as I
> decided to go the Xeon and C226 Chipset way, but oh well. Good thing
> is that I purchased them before their price skyrocketed.

I flat out refuse to run anything without ECC memory these days.
This is a major reason why I don't consider Core i chips an
option.
> Video Card: 2 * Sapphire Radeon 5770 FLEX
> Still going strong after 2 years of Bitcoin mining, undervolting them
> did wonders.

Did you stability test them? GPUs come pre-overclocked to within
1% of death from the factory.

> Hard Disk: Samsung SpinPoint F3 1 TB
> Used to be a very popular model 3 years ago.

I'd avoid Samsung and WD disks at all cost. They are unreliable
and either their SMART lies about reallocated sector counts, or
worse, they re-use failing sectors rather than reallocate them.

I also wouldn't consider putting any non-expendable data on
anything but ZFS - silent corruption happens far more often
than most people imagine, especially on consumer grade desktop

> Power Supply: CoolerMaster Extreme Power Plus 460W
> Still going strong after 4 years. If it could power up my current
> machine (Same as above but with an Athlon II X4 620 and ASUS
> M4A785TD-V EVO), it will with the new Haswell.
> Monitors: Samsung SyncMaster 932N+ and P2370H
> I'm going to use Dual Monitors.

I find that ATI cards struggle with dual monitors, at least the
high end ones (I use IBM T221s, which appear as 2-4 DVI monitors
due to signal bandwidth requirements). It's fine on Linux with
open source ATI drivers (just slow), but on XP I never managed
to get this to work at all with desktop stretching - most
games don't see any mode over 800x600. Things work fine with
Nvidia cards (except in games that have outright broken multi
monitor support, such as Metro Last Light).

> I'm intending on deploying Xen over a minimalistic Linux distribution,
> that would allow me to do basic administration like save/restore the
> VMs backup copies and have system diagnostic tools. Arch Linux seems
> great for that task, through I will have to check what I could add to
> it to be more user-friendly instead to having to rely only on console
> commands. The Hypervisor and its OS MUST be rock solid, and I suppose
> they will also be entirely safe if I don't allow it to have Internet
> access by itself, only VMs.

You might want to consider XenServer (based on CentOS). The main
thing I'd suggest is keeping your VM storage on ZFS for easy
snapshotting and other manipulation. I use such a setup and it has
worked extremely well for me.
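
As a rough illustration of the snapshot workflow mentioned above, here is a minimal Python sketch that only builds the `zfs` command lines for snapshotting a VM's dataset, rolling it back, and cloning it into a second VM. The dataset name `tank/vms/winxp` and the snapshot names are hypothetical, and the sketch assumes one ZFS dataset (or zvol) per VM, which is not something stated in this thread.

```python
def snapshot_cmd(dataset, name):
    """Command to take a point-in-time snapshot of a VM's dataset."""
    return ["zfs", "snapshot", f"{dataset}@{name}"]


def rollback_cmd(dataset, name):
    """Command to roll the dataset back to a snapshot.

    Plain `zfs rollback` only accepts the most recent snapshot; going
    further back needs `-r`, which destroys the newer snapshots.
    """
    return ["zfs", "rollback", f"{dataset}@{name}"]


def clone_cmd(dataset, name, new_dataset):
    """Command to create a writable clone of a snapshot, e.g. for a second VM."""
    return ["zfs", "clone", f"{dataset}@{name}", new_dataset]


# Hypothetical names, for illustration only.
for cmd in (
    snapshot_cmd("tank/vms/winxp", "clean-install"),
    clone_cmd("tank/vms/winxp", "clean-install", "tank/vms/winxp-2"),
):
    print(" ".join(cmd))
```

The commands are only printed here; running them for real would be a matter of handing each list to `subprocess.run` as root on the host that owns the pool.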

> 2 - A base Windows XP VHD that I would make copies to have as many
> VMs as needed. Its main purpose will be gaming. Reason why I may need
> more than one, is because currently, there are many instances where
> opening more than one game client from a MMO game I play, may
> sometimes cause a graphics glitch that slow downs my computer to a
> crawl and usually is unresponsive enough to not allow me to open the
> Task Manager and close the second client instance. This happen when I
> try to have two or more Direct3D based games running at the same time
> (Also, with other game, it can happen that the game client complains
> that Direct3D couldn't be loaded, but work properly after closing the
> already running clients). I have been googling around but can't find
> a name for this issue. I believe that having them on their own VM
> could solve it.

I didn't think firing up two D3D games at the same time was even
possible - or are you talking about just minimizing one?

> 4 - Possibly, another Linux VM where I can do load balancing with 2
> ISPs, as its probable that I end up with 2 ISPs on my home and I have
> yet to find a Dual WAN Router that doesn't cost a leg and a eye. If
> this is the case, all the other VMs Internet traffic should be routed
> via this one, I suppose.

You may find this is a lot easier to do on the host and carefully
choosing which devices to bridge VMs onto (i.e. in front or behind
the firewall bit). Load balancing across ISPs is always problematic,
though. There are all sorts of issues that crop up.

> 5 - Possibly, another Linux VM where I could send the two Radeons
> 5770 via passthrough to use them exclusively and unmolested for
> Litecoin mining, and let the integrated GPU handle everything else.

My understanding was that the combined cost of electricity and
hardware is nowadays such that mining is no longer cost-effective.
But whatever floats your boat.

> 6 - Possibly, I could get another keyboard, mouse and monitor, and
> assign them to a VM, that could be used for some guest visitors, so
> they can simultaneously use my computer to browse, effectively making
> it a multi-user machine out of a single one. Can also work for a
> self-hosted LAN party for as long as there are enough USB ports :D

I have had issues with USB port passing - specifically due to
interrupt sharing. The only thing I've managed to get working reliably
is passing USB ports that don't share interrupts with anything else,
and ports frequently do share interrupts (even if not device IDs).
This is with passing PCI USB controller devices. I found passing
USB devices directly was problematic in other ways, not least of
which was the extra CPU usage and perceivable response lag.
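
One way to check for the interrupt sharing described above is to look at `/proc/interrupts` in dom0 before choosing which USB controller to pass through. The following is a minimal Python sketch, assuming the usual Linux layout of that file (IRQ number, per-CPU counters, controller and trigger type, then a comma-separated list of handler names) and assuming USB host controllers show up with `hcd` in their handler name (e.g. `ehci_hcd:usb1`). The function names are made up for illustration, and the parsing is deliberately simple.

```python
def parse_irq_handlers(proc_interrupts_text):
    """Map each numeric IRQ to the list of handler names registered on it."""
    handlers_by_irq = {}
    for line in proc_interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        if not sep or not label.isdigit():
            continue  # header row, or named rows such as NMI / LOC
        fields = rest.split()
        # Drop the leading per-CPU counters.
        while fields and fields[0].isdigit():
            fields.pop(0)
        if not fields:
            continue
        parts = [p.strip() for p in " ".join(fields).split(",")]
        # The first comma-separated part still carries the controller and
        # trigger type; the handler name is its last whitespace token.
        names = [parts[0].split()[-1]] + [p for p in parts[1:] if p]
        handlers_by_irq[int(label)] = names
    return handlers_by_irq


def shared_usb_irqs(proc_interrupts_text):
    """Return IRQs that a USB host controller shares with any other handler."""
    shared = {}
    for irq, names in parse_irq_handlers(proc_interrupts_text).items():
        if len(names) > 1 and any("hcd" in name for name in names):
            shared[irq] = names
    return shared


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        for irq, names in sorted(shared_usb_irqs(f.read()).items()):
            print(f"IRQ {irq} is shared: {', '.join(names)}")
```

A controller that never appears in this output is the better passthrough candidate by the reasoning above. Under Xen the controller/type column may read differently from bare-metal Linux, so treat the result as a hint rather than a verdict.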

It also occurs to me that by this point you will need a LOT
of PCIe slots for all your GPUs and USB controllers. And a lot
of desk space for all the monitors, keyboards and mice.

> Additionally, as I have tons of RAM but no SSD, I will surely use a
> RAMDisk.

32GB of RAM doesn't sound at all like much relative to what you are
actually trying to do. I'm doing half that much and am thinking it
would be handy to upgrade my machine from 48 to 96GB of RAM.
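
To make the memory budget concrete, here is a small Python sketch that adds up what a plan commits and reports what is left. All guest names and sizes below are placeholders, not figures from this thread, and the sketch assumes the RAM disk is carved out of the same physical memory as dom0 and the guests.

```python
def ram_headroom_gb(total_gb, dom0_gb, guests_gb, ramdisk_gb=0):
    """Return RAM left over after dom0, all guests and the RAM disk.

    A negative result means the plan asks for more memory than is installed.
    """
    committed = dom0_gb + sum(guests_gb.values()) + ramdisk_gb
    return total_gb - committed


# Hypothetical plan: every size here is an illustrative placeholder.
plan = {
    "everyday-linux": 4,
    "xp-gaming-1": 3,
    "xp-gaming-2": 3,
    "router": 1,
    "mining": 2,
    "guest-seat": 2,
}
print(ram_headroom_gb(total_gb=32, dom0_gb=2, guests_gb=plan, ramdisk_gb=12))
```

Plugging in your own numbers shows quickly how much of the 32 GB a RAM disk large enough to hold a VHD takes away from the guests.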

> Basically, I expect that I would be able to set up a RAMDisk
> a few GBs worth in size, copy the VHD that I want there, and load the
> VM at stupidly fast speeds. This should work for any VHD where I want
> the best possible IOPS performance and don't mind that it is volatile
> (Or backup often enough).

Even with PV drivers you'll likely bottleneck on CPU before you hit
the throughput a decent SSD is capable of. Especially if you run
your VMs off ZFS like I do. Granted, this is a ZFS issue, but
I find present day storage is too unreliable to be entrusted to
any FS without ZFS' extra checksumming and auto-healing features.

> Up to this point, everything is well. The problem is the next part...
> As when you do passthrough of a device, neither other VMs nor the
> Hypervisor can use them without reassigning that device

That is indeed correct - you cannot share a PCI device between
multiple VMs simultaneously.
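
Since each passed-through device can belong to only one running VM, it can help to sanity-check a plan before writing the guest configs. Below is a minimal Python sketch, assuming each guest's devices are listed by PCI address (BDF), as in the `pci = [ ... ]` line of an xl guest config. The VM names and addresses are hypothetical.

```python
from collections import defaultdict


def find_conflicts(assignments):
    """Return PCI addresses claimed by more than one VM.

    `assignments` maps a VM name to the list of PCI addresses (BDF strings)
    it wants passed through. The result maps each contested address to the
    VMs claiming it; an empty dict means the plan has no overlaps.
    """
    owners = defaultdict(list)
    for vm, devices in assignments.items():
        for bdf in devices:
            owners[bdf.lower()].append(vm)
    return {bdf: vms for bdf, vms in owners.items() if len(vms) > 1}


# Hypothetical plan: both VMs want the GPU at 01:00.0.
plan = {
    "xp-gaming": ["01:00.0", "01:00.1"],
    "mining": ["02:00.0", "01:00.0"],
}
for bdf, vms in find_conflicts(plan).items():
    print(f"{bdf} is claimed by: {', '.join(vms)}")
```

A conflict is only a problem if the VMs involved need to run at the same time; VMs that are never up together can take turns with a device.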

> (And that
> device also needed a Soft Reset function or something like that if I
> did my homework properly),

You didn't do your homework correctly. Resetting is not really that
much of an issue if you pick your hardware carefully (e.g. avoid
NF200 PCIe bridges, ATI GPUs and motherboards that people on this
list haven't extensively tested to work in a trouble-free way).

> suddenly I have to decide in what VMs I
> want my 3 GPUs and 2 Monitors (That should be physically connected to
> them, so video output will be of where the GPU is at that moment) to
> be at, and what could get away by using emulated Drivers (This should
> also apply to Audio, but I think than that one is fully emulable).

Not on XP it isn't. None of the QEMU emulated audio devices have
drivers on XP and later. Latest upstream QEMU supposedly has
Intel HDA audio emulation, but I haven't been able to test that yet.
The two things that I have verified to work OK are PCI passthrough
of audio devices (I am using a Sound Blaster PCIe) and USB audio
hanging off of PCI passthrough USB controllers.

So you'll need yet more PCI/PCIe slots and/or USB ports with
non-shared interrupts.

> Considering this, I suddenly have to really think what goes where,
> which is what I can't decide.
> Assuming I do passthrough of the 2 GPUs to the said Linux VM for
> mining, I wouldn't need a Monitor attached to either, and I could do
> passthrough of the IGP to the Windows gaming VM with Dual Monitors.

There have been some long threads on the list recently about IGP
VGA passthrough. It looks like results are very hardware dependent.

And I'd be very surprised if you manage to get some serious
non-retro gaming done on the IGP. I'm currently using a quadrified
GTX480 (to Quadro 6000) for a 2560x1600 monitor and a quadrified
GTX680 (to Quadro K5000) for a 3840x2400 monitor, and I could
do with more GPU power on both.

> However, in this case, I suppose that I wouldn't have video output
> from neither the Hypervisor nor any of the other VMs, including my
> everyday Linux one, killing the whole point of it, unless that IGP can
> be automatically reassigned on-the-fly, which I doubt.

XP for one doesn't seem to handle GPU hotplug properly. Win7 did
when I briefly tested it, but that still sounds like a lot of hassle.
VGA passthrough is problematic enough as it is without such further

> This means that
> the most flexible approach, would be to leave the IGP with a single
> Monitor for the Hypervisor, and each 5770 to a Windows gaming VM, but
> then, I would be short on one Monitor. Basically, due to the fact
> that a Monitor is attached to a GPU that may or not be where I want
> the video output at, I may need to switch Monitors from output often,
> too. So in order to do it with only passthrough, I will have to take
> some decisions here.

More monitors or a KVM switch?

> So, I have to find other solutions. There are two that are very
> interesting, and technically both should be possible, though I don't
> know if anyone tried them.

I think your list of requirements is already spiraling out of
control, and if you really are mostly a Windows user, you may
find the amount of effort required to achieve a system this
complex is not worth the benefits. As I said, I'm finding myself
having to write patches to get my system working passably well,
and the complexity level on that is considerably lower than
what you are proposing.

> The first one, is figuring out if you can route the GPU output
> somewhere else instead of that Video Card's own video output. As far
> that I am aware, there are some Software solutions that allows to do
> something like this: One is Windows-based Lucid Virtu, and the other
> nVidia Optimus Drivers. Both are conceptually the same: They switch
> between the IGP and the discrete GPU depending on the workload.
> However, the Monitor is always attached to the IGP output, and what
> they do, is copying the framebuffer from the Video Card to the
> integrated one, so you can use the discrete GPU for processing while
> redirecting it to the IGP's video output. If you can do something
> like this on Xen, it would be extremely useful, because I could have
> the two Monitors always attached to the IGP and simply reassign the
> GPUs to different VMs as needed.

You could just use VNC for everything except your gaming VM(s).
Dynamic GPU switching between VMs is somewhat ambitious, but by all
means, have a go and write it up when/if you get it working properly.

> Another possible solution, would be assuming that the virtual GPU
> technologies catch up, as I am aware that XenServer, that is based on
> Xen, is supposedly able to use a special GPU Hypervisor that allows a
> single physical GPU to be shared in several VMs simultaneously as a
> virtual GPU (In the same fashion that VMs currently see the vCPUs).

This was only announced as a preview feature a few days ago.
I wouldn't count on it being as production-ready as you might hope.

VMware ESX does something similar in the most recent version, but
it's only supported on the Nvidia Grid cards. Those are _expensive_
but you might be able to get away with modifying some GTX680/GTX690
cards into Grids to get it working. You'll have to take a soldering
iron to them, though.

> This one sounds like THE ultimate solution. Officially, nVidia support
> this only on the GRID series, while AMD was going to release the
> Radeon Sky aimed for the same purpose, through I don't know what
> Software solutions it brings. However, it IS possible to mod Video
> Cards for them to be detected as their professional counterparts and
> maybe that allows the use of the advanced GPU virtualization
> technologies only available on these expensive series:
> http://www.nvidia.com/object/grid-vgx-software.html
> http://blogs.citrix.com/2013/08/26/preparing-for-true-hardware-gpu-sharing-for-vdi-with-xenserver-xendesktop-and-nvidia-grid/
> http://www.eevblog.com/forum/projects/hacking-nvidia-cards-into-their-professional-counterparts/

You will notice that the list of supported hardware for VMWare VGX
is extremely limited - and for very good reason. And other options
aren't yet production ready as far as I can tell - but I could
be wrong.

> I think that there are some people that likes to mod GeForces to
> Quadros because they're easier to passthrough in Xen. But I'm aiming
> one step above that should I want a GeForce @ Grid mod, as I think
> that full GPU virtualization would be a killer feature.

You'd better start reading through the Xen source code and get
ready to contribute patches to help make this work. :)

> All my issues are regarding this last part. Do someone have any input
> regarding what can and can not be currently done to manage this? I
> will need something quite experimental to make my setup work as I
> intend it to.

The input from me would be to stop dreaming and come up with a list
of requirements a quarter as long, and then maybe you can have
something workable in place with less than two weeks of effort
(assuming you take two weeks off work and have no other obligations
to take up any of your time).

> Another thing which could be a showstopper, is the 2 GB limit on VMs
> with VGA passthrough I have been hearing, through I suppose will get
> fixed in some future Xen version. I'm looking for ideas and people
> that already tried this experiences to deal with it.

One of the memory limitation bugs has been fixed in Xen 4.3.0. The
other (the one I've been having, courtesy of the NF200 PCIe bridges
being buggy) I have a workable-ish prototype patch for, but it's
nowhere near production ready. But these would be the least of
your problems with the above requirements.


Xen-users mailing list
