
Re: [Xen-users] ATI VGA Passthrough / Xen 4.2 / Linux 3.8.10



On 05/10/2013 08:42 PM, Casey DeLorme wrote:

             2) Have you tried disabling IRQ balancing
             (noirqbalance kernel parameter + disable irqbalance service)?


        No clue what that is.  Can you provide any direction?  I'd be happy to
        test.


    In your boot loader, find the kernel and xen lines and add:

    On the xen line:
    noirqbalance

    On the dom0 kernel line:
    noirqbalance


How would disabling irqbalance help fix the problem?  Just curious; as
I understand it, that tool balances interrupt requests across CPUs, like
a scheduler of sorts.

I am purely guessing here, but if the VM runs on a CPU other than the one that handles the interrupts for the hardware it has been passed, strange things might happen, possibly more so if the CPU in question is not just a different core but on a different socket. Disabling the rotation of interrupt handling between the cores may alleviate an issue with IRQ routing.

But take this with a bucket of salt - I am _purely_ guessing here.
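
For anyone who would rather check than guess: a rough Python sketch that reads text in the format of /proc/interrupts and shows how the interrupt counts for a given device are spread across CPUs. The function name, the sample data and the "pciback" match string are my own assumptions for illustration, not output from anybody's machine on this thread.

```python
def irq_cpu_counts(interrupts_text, device_substring):
    """Return {irq: [count per CPU]} for lines that mention the device.

    interrupts_text is expected to look like /proc/interrupts: a header
    row of CPU names followed by one row per IRQ.
    """
    lines = interrupts_text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    result = {}
    for line in lines[1:]:
        if device_substring not in line:
            continue
        irq, _, rest = line.partition(":")
        counts = rest.split()[:ncpus]
        # Skip summary rows that do not carry one number per CPU.
        if len(counts) < ncpus or not all(c.isdigit() for c in counts):
            continue
        result[irq.strip()] = [int(c) for c in counts]
    return result


def cpus_used(counts):
    """Number of CPUs that have serviced the IRQ at least once."""
    return sum(1 for c in counts if c > 0)


# Made-up sample in the /proc/interrupts layout (illustration only).
SAMPLE = """\
           CPU0       CPU1       CPU2       CPU3
 16:       1200          0          0          0   xen-pirq-ioapic-level  pciback
 17:        310        295        305        290   xen-pirq-ioapic-level  ehci_hcd
"""

for irq, counts in irq_cpu_counts(SAMPLE, "pciback").items():
    print(irq, counts, "CPUs used:", cpus_used(counts))
```

On a real dom0 you would pass the contents of /proc/interrupts instead of SAMPLE. If the counts for the passed-through device stay on one CPU after adding noirqbalance, the setting is at least taking effect; whether that cures the crashes is a separate question.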

             3) Are you assigning > 4GB of RAM to the guest? I found a post
             in the archive last night mentioning that there's an outstanding qemu
             issue with > 4GB of RAM given to the guest. I didn't get around to
             re-trying the VM with 3.5GB yet.


        Yes sir.  It's got 8 GB + 1 GB for the standard video adapter.  Not sure
        if that's improper, but it boots just fine with a single card, and the
        5850 I plugged in for a short while seemed well behaved.  Here's a copy
        of my vm config file: http://pastebin.com/bX0ayA0u


    I think reducing the guest RAM to 3.5GB is worth a shot, along with
    only passing a single GPU device.


If I recall the RAM limit is specific to PV guests or older versions of
Xen.  I have run Windows with 4, 6, 8, and 16GB of RAM without ever
encountering this problem, and this includes tests with the xl toolstack
on Xen 4.1.2.

Including VGA passthrough on those guests?

                 The only single GPU cards I have are the Radeon 5850s in the AMD
                 box I have.  I'm just a little reticent to tear the thing apart though
                 cause it gets used a lot.  I think my next step is to look for a video
                 card that properly supports FLR,


             As far as I can tell, for all the talk of it - there is NO SUCH THING.
             Somebody on the list posted lspci -vvv from their ATI FirePro card
             which shows it has no FLR, and I have just got a Quadro 2000, which also
             lacks FLR.

             The only vague mention I have seen of FLR on GPUs is on the Intel GPU on
             the very latest generation of Core i CPUs (the built in one). And even
             if that is true it's not all that useful for gaming.


        Heh.  The crappiest GPU that would ever be in my system is the most
        compatible?  Good grief. :P


    I'm not sure about compatible, but it seems to have a feature that
    the others don't - then again, take that with a pinch of salt - I
    don't have one, and I tend not to believe such things until somebody
    shows me the lspci dump that proves it.


Where did you find mention of the newer integrated graphics supporting
FLR?  I have an IvyBridge 3770 with an HD4000, but when I ran lspci -vv
and -vvv I did not see FLReset+, but maybe I did something incorrectly
as I also did not see any mention of FLReset anywhere?  If the Ivybridge
integrated has FLReset I would totally want to test it.  It may not be a
powerful chip compared to modern discrete cards, and it won't prove that
the lack of FLR is the cause of our AMD/nVidia problems, but it would
show the effect the presence of FLR has.

I came across a post on a forum or a mailing list from someone after googling something like

"GPU" "FLreset+"

and then trawling through a few hundred pages to find one that actually lists lspci output referring to a GPU.

Having said that, I have also found references to people claiming that FirePro and Quadro cards have FLR, which is quite clearly not the case.

So let's not assume that it's true just because somebody on the internet said so. :)
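
If you want to check your own card rather than trust a forum post, something along these lines should do it: a small Python sketch that scans saved `lspci -vvv` output and reports, per device, whether the FLReset flag is shown as + or -, or is absent. The function name and the sample text are made up for illustration. It takes the first FLReset flag seen for each device, on the assumption that the DevCap line (the capability) is printed before DevCtl, where some lspci versions print a flag of the same name.

```python
import re


def flr_support(lspci_text):
    """Map each device header line to True (FLReset+), False (FLReset-)
    or None (no FLReset flag printed for that device)."""
    result = {}
    current = None
    for line in lspci_text.splitlines():
        if line and not line[0].isspace():
            # Unindented line: start of a new device section.
            current = line.strip()
            result[current] = None
        elif current is not None and result[current] is None:
            # Only the first flag counts (assumed to be the DevCap one).
            match = re.search(r"\bFLReset([+-])", line)
            if match:
                result[current] = match.group(1) == "+"
    return result


# Made-up excerpt in the style of `lspci -vvv` (illustration only).
SAMPLE = """\
01:00.0 VGA compatible controller: Example GPU (made up)
    Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
        DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
02:00.0 Ethernet controller: Example NIC (made up)
    Capabilities: [a0] Express (v2) Endpoint, MSI 00
        DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s <512ns, L1 <64us
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+
        DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
03:00.0 ISA bridge: Example bridge (made up)
    Control: I/O+ Mem+ BusMaster+
"""

for device, flr in flr_support(SAMPLE).items():
    print(flr, device)
```

Run lspci -vvv as root and feed the saved output in, since without root the capability list may not be printed at all, which would look the same as no FLR.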

                                2) My motherboard's PCIe slots are behind NF200 PCIe bridges
                                (yes, EVGA have decided in their infinite wisdom to put all 7 PCIe slots
                                behind NF200s, none are directly attached to the Intel NB).

                                  I'm so sorry :P. NF200 has probably caused a lot of xen
                                  tinkerers to utter a few dozen cuss words a piece.

                                  I can believe that. What is the solution, though?

                                  The thing that drives me really nuts about the issues I'm seeing
                                  (which may or may not be specifically related to the NF200) is
                                  that it is so intermittent. It works well enough to boot up and work with a
                                  gaming type load for a few minutes. Then something happens that
                                  causes the VGA card to require a reset, and it all falls apart.

                                My solution was to buy another motherboard, I had no luck at all
                                passing the devices behind the NF200, and similar to your situation
                                all but one PCIe slot on that board was behind that bridge.


                            Did you not manage to get it working at all? Or was it just
                            intermittent like in my case? I can typically get about 5 minutes of
                            gaming out of my ATI card before it all goes wrong.

                            Ironically, I was thinking about an Asus Sabertooth with an 8-core AMD,
                            but opted to go for broke and get a couple of 6-core Xeons and an
                            EVGA SR-2. It turns out, a solution that is 4x more expensive isn't
                            actually better... :(


                         I was unable to get it working at all.  The NF200 simply threw errors
                         that 100% prevented me from passing the device.  I think it was missing
                         a number of specific features required for passthrough, and I vaguely
                         remember running lspci -vvv to verify what was missing.  Perhaps not all
                         NF200's are created equal?


                     The only logged issue I had with the NF200s was the lack of ACS, which
                     can be disabled as I mentioned on this thread (at least if you are using
                     the xm stack). After I disabled that PCI passthrough has been working OK.
                     It's just VGA passthrough BSOD-ing after some minutes that is causing me
                     problems.


                 In reading up on the wiki, there does indeed seem to be a lot more
                 info regarding the use of xl and PCI Passthrough today than the last
                 time I looked.  It seems that these types of configuration options are
                 set on a domain-by-domain basis, or even by device; docs say that
                 things like VPCI vs direct PASS mapping of slot layout(?) is actually
                 configured at the device level either in your DomU config file
                 (like: pci = ['0:d:0.0, pci-just-forking-work-damn-____you']) or via xl
                 (like: xl pci-attach 1 0:d:0.0 pci-just-forking-work-damn-____you).



             Hmm... I honestly don't think the xl way will succeed where xm is
             unstable, but I might give it a shot.


        You'd still likely require all the "hacks" you're currently using, but
        they'll all move to different places I'm guessing... if the toolstack
        itself doesn't have any bearing on this (which is my suspicion) then you
        don't want to go doing all the extra work for nothing, of course!


    Exactly. And from what I have read so far (somebody point me to
    something that says otherwise), more people seem to have reported
    success with the xm stack than with xl (but that could just be due to
    the xl stack being much more recent).


I would go as far as to say that most of those reports came from people
who used the packaged Xen, and until very recently the packaged Xen was
4.0 or 4.1 where xm is still the default toolstack.

Which I don't find to be in any way an encouragement to even attempt to do this using the xl tool stack at the moment. :)

Gordan


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

