Re: [Xen-users] ATI VGA Passthrough / Xen 4.2 / Linux 3.8.10

Top posting :P

Hello Gordan, Casey,

I hope you've had a good weekend.  I got back to my project this
morning; I decided to shove one of my 5850's into my board to see if I
could get it to work...

I've had this Windows DomU running, with GPLPV drivers, for a few
hours now.  Performance is excellent.  I'm using the 5850
passed-through as a PCIe device.  One of my 6990s is also plugged in,
and it's being used by Dom0.  Comically, I've got the better monitor
plugged into my Dom0's card because this 5850 lacks Mini DisplayPort.

I also can't get gfx_passthru=1 to work.  Nothing happens other than
an SDL window claiming to be a Serial console showing up on my Dom0's
screen.  I even have the 5850 set up as my BIOS's primary video card.
Oh well :)

Gordan, I'm going to poke through your other email later and see if I
can present some information to help you line up any of your
suspicions.  Given the way things have gone for me---and I've
basically duplicated as much of your and Casey's setups as humanly
possible here---I've got to believe the problem here is ACS, or
something related to it.  I can even reboot this VM and the card just
keeps on working.
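
For anyone who wants to run the same checks, here is a rough sketch of
the two things I'd look at: whether the bridge above the card lists an
ACS capability at all, and whether the kernel log contains AER
complaints like the one Gordan quoted.  The helper names are made up
for illustration, and the 00:03.0 address is only a placeholder (it is
Gordan's root port, not necessarily yours).  As far as I know, lspci
needs to run as root to print the full capability list.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any tool.

# Reads `lspci -vvv` output on stdin and reports whether an
# "Access Control Services" capability line is present.
has_acs() {
    if grep -q "Access Control Services"; then
        echo "ACS capability listed"
    else
        echo "no ACS capability listed"
    fi
}

# Reads kernel log text on stdin and prints the number of lines that
# mention both the pcieport driver and AER, such as
# "pcieport ... AER: Multiple Uncorrected (Non-Fatal) error received".
# grep -c exits non-zero when the count is 0, hence the `|| true`.
count_aer() {
    grep -c "pcieport.*AER" || true
}

# Example usage (assumed device address; substitute the bridge that
# sits above your card in `lspci -vt`):
#   lspci -vvv -s 00:03.0 | has_acs
#   dmesg | count_aer
```

Both helpers read from stdin on purpose, so the same functions work on
live output or on a saved log or lspci dump that someone pastes to the
list.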

On another note, should we retire this thread soon?  It's getting a
bit long and I don't want to discourage any future googlers, nor get
too off topic :P


On Fri, May 10, 2013 at 6:39 PM, Gordan Bobic <gordan@xxxxxxxxxx> wrote:
> On 05/10/2013 09:19 PM, Andrew Bobulsky wrote:
>>              2) I actually have it working - for 5 minutes or so at a time.
>>              If the problem was the lack of ACS, it wouldn't work at all.
>>         I just can't help but wonder if it /is/ the problem, though.  It's
>>         the only thing I can pin down that our situations have in common as
>>         far as its being the only "non-compatible" portion of the
>>         implementation, aside from the nearly identical behavior, of course.
>>         Maybe the AMD driver does some stupid stuff that ACS can mitigate?
>>         I just wish I knew more :(
>>     Now you got me thinking... I noticed that when the GPU starts to
>>     head toward the crash, this appears in the syslog:
>>     May  6 16:35:51 normandy kernel: pcieport 0000:00:03.0: AER:
>>     Multiple Uncorrected (Non-Fatal) error received: id=0000
>>     It certainly makes me wonder.
>>     Has anyone else seen this error?
>>     The device ID in question is:
>>     00:03.0 PCI bridge: Intel Corporation 5520/5500/X58 I/O Hub PCI
>>     Express Root Port 3 (rev 22)
>>     which does not bode well...
>>     Duff hardware?
>> Hmmm... I'll poke through my syslog at the next crash.  I tried:
>>         cat /var/log/syslog | grep pcieport
>>         cat /var/log/syslog.1 | grep pcieport
>>         dmesg | grep pcieport
>> Nothing came back from any of those.  I'll see if I can identify any
>> unique errors myself though!
> Worth paying attention to. :)
>>                  So what might intrigue you the most here is that while I'm
>>                  stuck with a VGA device sitting behind this non-ACS
>>                  compliant switch... My results are almost identical to
>>                  yours.  Passing one of the VGA devices to the DomU, with or
>>                  without the corresponding HDMI audio doesn't seem to
>>                  matter, I get this:
>>                  " it is so intermittent. It works well enough to boot up
>>                  and work with a gaming type load for a few minutes. Then
>>                  something happens that causes the VGA card to require a
>>                  reset, and it all falls apart."
>>                  Seriously :P
>>              And you are convinced this is to do with the availability of
>>              ACS?
>>         Like I said, it's the only thing that I can pinpoint as being a
>>         hindrance to compatibility.  I guess my request here is if anyone
>>         can help me determine whether or not that's true?
>>     What motherboard are you using? Has anyone successfully used it for
>>     VGA passthrough? I don't think the possibility of both of us having
>>     similarly duff hardware has been systematically excluded yet.
>> I think I said it, but I'll link here anyway:
>> http://www.gigabyte.us/products/product-page.aspx?pid=2957#ov
> Indeed, you did. Apologies, it's been a long week. :p
>> As to whether or not anyone's used it for passthrough before... I've got
>> no clue.  Probably not too many people, seeing as how I'm essentially
>> running a custom BIOS :P
> BIOSes are getting so crap (except maybe on Asus boards) these days that I'm
> amazed anything works at all. You wouldn't believe the amount of BIOS
> bugginess people are encountering on the SR2, and that's now an EOL product
> that should by now have had most of its bugs fixed (yeah - right).
>>                  It eventually likes to BSOD, usually on atikmpag.sys I
>>                  think.  Plenty of "an attempt was made to reset the display
>>                  adapter and failed" blah blah blah.
>>              Yes, all too familiar.
>>                  This happens 100% of the time if I try to boot with both
>>                  devices attached.
>>              Both devices?
>>         Yes---that is to say both of the VGA controllers from the 6990.
>>         The relevant portion of my lspci looks like this:
>>         http://pastebin.com/raw.php?i=GwekPNAW
>>     OK, I get it. I seem to remember reading in the archives that dual
>>     VGA passthrough is problematic (my experience over the years shows
>>     that multiple GPUs are a false economy of highly questionable
>>     benefit).
>> That's actually pretty much completely accurate.  It drives me
>> particularly up the wall because I hate running things in full screen,
>> and crossfire basically doesn't work at all without that :P
> I like my full screen gaming - but throw something obscure like an IBM T221
> into the mix and things start to get rather non-trivial. T221 is 3840x2400
> which is too much for DL-DVI to drive. But it's a 10+ year old monitor
> design and it actually takes 3xSL-DVI (but there's an adapter available that
> makes it drivable using 2xDL-DVI instead).
> Then you have to stitch the screens together (workable with 2xDL-DVI on XP,
> you need a Quadro or an Eyefinity card for the driver features to do it on
> Vista and 7). What I've found back when my old 4870X2 was bleeding edge was
> that with dual monitors attached, the 2nd GPU never did anything at all
> (stayed stone cold, performance unaffected by Crossfire).
> Since then I've learned my lesson - buy the biggest single GPU you can
> afford - it's as good as it's going to get. Everything else is going to be
> hit-and-miss. Debugging other people's products may be fun when you're 14,
> but I'm two decades too old to not have something better to do with my time.
> Nowadays I appreciate things that "just work" - the unfortunate thing I'm
> finding, however, is that there tend to be no things that "just work" that
> include all the features that I want - which in turn leads to endless
> debugging of other people's software to get it to do what I want, because
> apparently, nobody else has tried it before. :-/
>>         Note: devices 09 and 0a are my "primary" 6990's vga controllers.
>>         Also, my crossfire bridge is disconnected.  I'm working with the
>>         other card, devices 0d and 0e.  I've included the USB card as well
>>         in the list because I'm using it, but it causes me no problems
>>         whatsoever.  For what it's worth, that USB card works great in ESXi
>>         as well... Highpoint enabled ACS on their PEX chips :D
>>              Just out of interest:
>>              1) Are you using a multi-socket motherboard?
>>         Nope!  It's a Gigabyte GA-EX58-EXTREME.  It's LGA1366 with an i7 920
>>         in it.  VT-d support is provided through a hacked BIOS image that I
>>         found on the web a couple years or so ago.
>>     Having to use a hacked BIOS for VT-d support is not a good sign or a
>>     good starting point...
>> Technically, you're right.  AFAIK though, this particular generation of
>> i7 chips allows for VT-d to be managed entirely by the chipset/bios.
> That's just it - I don't like things only manageable by binary blobs with no
> source code. I'd much rather just have a clean interface (e.g. from /sys/)
> to just write the relevant registers straight to the hardware to
> enable/disable features. Otherwise you're at the mercy of motherboard
> manufacturers who have no interest in supporting a product for people who
> have already bought it (sale's made, why should they care).
>>   There's no particular req (however artificial) coming out of the CPUs
>> for this generation that stipulates VT-d can't be patched in... so I
>> figured, "why not?"  I was modding my BIOS anyway and decided to use
>> this one as a base because it had both VT-d and fully updated option
>> ROMs for all my onboard stuff.  The world of BIOS modding is a /very/
>> neat one; I highly suggest every nerd spend a few days there at some
>> point in his life ;)
> Last time I checked, this was mostly limited to people using BIOS editors to
> unhide features. Have things actually progressed to the point where you can
> add in a specific assembly payload to initialize things differently?
>> To the point though, it seems very well behaved on everything that
>> /isn't/ my 6990 :-(
> Didn't you mention you had another ATI GPU in another rig that you could
> borrow temporarily? It might be worth a shot to see if it's the dual GPUs
> that are foiling you. Especially since they are inevitably on the same PCIe
> bridge. A standalone single GPU might just work.
> Ironically, my Quadro has been refusing to play ball completely today (it
> worked passably well yesterday, although not as well as my 6450 card, which
> today seems to be working well enough to get to the login screen without
> BSOD-ing). Different slot this time, though, so we'll see how it fares in a
> bit.
> [noirqbalance, limiting guest to 3.5GB of RAM]
> [screen corruption, white/black lines]
>> Yeah.  I'm convinced now.  They might be a different color, but they're
>> in chrome (which uses a GPU accelerated 2d canvas) and they seem to
>> precede the crash pretty reliably.
> Yes, similar here, although I don't use Chrome - I get them in most things,
> including on the desktop once it has all started to go wrong.
>>                  though I'm considering a hard-hack: think of a 12v relay
>>                  and a PCIe extender cable---if a D3D0 reset actually powers
>>                  off the slot momentarily but the PSU plugs on the card
>>                  prevent it from working, then I could rig up a switch that
>>                  ties those plugs' power state into the slot itself---it's
>>                  radical, yes, but possibly the most inventive solution I
>>                  can think of so far.  I'm super curious to see if anyone
>>                  more knowledgeable than myself thinks it would work,
>>                  because it'd be super cheap to build!  As the saying goes
>>                  though, I'll "cross that bridge when I come to it." :)
>>              Interesting. In theory, I think this _should_ work provided
>>              your PCIe bridges support hot-plugging.
>>              To be certain, you'd have to switch both the PCIe slot and (if
>>              your card uses it) the external power inputs.
>>         That'd be the idea.  Assuming it works the way I think it does, I
>>         could tap a 12v (I'm pretty sure it's 12v in there) relay into the
>>         Vcc and GND pins of the PCIe slot and use the relay's output to
>>         switch the Vcc from the plug-in cables off of the PSU.  Bears
>>         testing with a slightly less expensive card, but I wouldn't be
>>         surprised to see it work!  It'd require some case modding for sure
>>         though, as the extension cable will get in the way of properly
>>         seating the card.  It could be possible to build a tap that could be
>>         "slipped in" to a card's PCIe slot...  Short of proper FLR support,
>>         this could actually very cheaply be built into the expansion card
>>         itself.  I'd suspect that simply adding FLR would be cheaper on the
>>         card manufacturers though. :)
>>     Just get a case with more slot cutouts on the back than your
>>     motherboard has slots. Then feed the ribbon to the bottom so the
>>     card sits in the slot on the case that is below your motherboard -
>>     no modding required. :)
>> But... but!  I guess that'd require a mini(?) or MicroATX board.  I'm a
>> full size to XL ATX (or whatever the monster-sized boards are) kind of
>> guy.  Guess I just want more slots to pass GPUs to VMs, eh? :)
> You don't need a smaller motherboard - you need a bigger case. :)
> With your board, you could probably do this with a PC-P80 Armorsuit (one of
> the few off the shelf cases that will take my SR-2 due to a weird,
> needlessly oversized form factor - I mean seriously, who needs 7 PCIe x16
> slots??).
> Hmm... Something just occurred to me - on the SR-2 this could be implemented
> _TRIVIALLY_! The SR-2 has jumpers to disable/enable each of the PCIe slots.
> So in theory, all I'd have to do is put together a simple USB controlled
> switch that would toggle between connecting pins 1-2 and 2-3, and attach it
> using a normal 3-pin jumper-type header to the jumper block in question. Or
> (boringly), just wire it up to a suitable button on the front of the case.
> I might just have to try this and see what happens (and hope it doesn't make
> the magic smoke escape from something).
>> There's supposed to be some cases out there that allow for mounting of
>> expansion cards on the end of flexible extenders.  Haven't heard about
>> them in a couple years, but either way chances are pretty good that such
>> cases aren't exactly affordable... they likely target enterprise
>> customers or simply have limited runs... economy of scale and all that.
>>   Probably the "slip-in" type of adapter/approach would be best, but I
>> don't wanna get ahead of myself on a simple idea that may not even work :P
> Usually rack-mount cases.
> But it's amazing what you can achieve with a dremel and a power drill in a
> few minutes. ;)
>>                  With that in mind, even though I've taken your advice and
>>                  added the config info to my xend files, it's entirely
>>                  possible---especially in light of what Casey said---that
>>                  I'm just Doing It Wrong(TM).  It'd likely be beneficial for
>>                  us both to compare notes on that regard.  If either of you
>>                  would be willing to help, I could probably use some
>>                  pointers... I've kinda run out of logs to look at with my
>>                  current knowledge on the subject :P
>>              Certainly - what notes do you propose we compare?
>>         I'm not completely sure.  If you can point me to the proper files to
>>         verify that my device has the same PCIe-level compatibility issues
>>         as yours (verify that ACS isn't available to the device and so on)
>>         then I'd call that a step in the right direction.
>>     Another thing - Do "lspci -vt" - can you put the card in a slot
>>     where it doesn't share a bridge with any other PCIe devices?
>> I don't think so.  You should see the built-in bridge... it's implied
>> slightly up the hierarchy from the two side-by-side 6990 devices, which
>> itself attaches to the root port at the top:
>> http://pastebin.com/raw.php?i=4dGmneYi
> But the 2 GPUs are inevitably on the same bridge. I think trying a single
> GPU would definitely be a good next step in troubleshooting.
>> Wish me luck!
> To both of us! :)
> Gordan
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxx
> http://lists.xen.org/xen-users