Re: [Xen-users] ATI VGA Passthrough / Xen 4.2 / Linux 3.8.10
On 05/10/2013 08:42 PM, Casey DeLorme wrote:

>>>> 2) Have you tried disabling IRQ balancing (noirqbalance kernel
>>>> parameter + disable irqbalance service)?
>>>
>>> No clue what that is. Can you provide any direction? I'd be happy
>>> to test.
>>
>> In your boot loader, find the kernel and xen lines and add:
>>
>> On the xen line: noirqbalance
>> On the dom0 kernel line: noirqbalance
>
> How would removing noirqbalance help fix the problem? Just curious; as
> I understand it, that tool is used to balance requests, like a
> scheduler of sorts.

I am purely guessing here, but could it be possible that if the VM uses
a CPU other than the CPU that handles the interrupts for the hardware it
has been passed, strange things happen - possibly more so if the CPU in
question is not only not the same core, but not even the same socket?
It's possible that disabling the rotation of interrupt handling between
the cores alleviates an issue with IRQ routing. But take this with a
bucket of salt - I am _purely_ guessing here. (There is a sketch of
where the flags would go in a boot entry further down.)

>>>> 3) Are you assigning > 4GB of RAM to the guest? I found a post in
>>>> the archive last night mentioning that there's an outstanding qemu
>>>> issue with > 4GB of RAM given to the guest. I didn't get around to
>>>> re-trying the VM with 3.5GB yet.
>>>
>>> Yes sir. It's got 8 GB + 1 GB for the standard video adapter. Not
>>> sure if that's improper, but it boots just fine with a single card,
>>> and the 5850 I plugged in for a short while seemed well behaved.
>>> Here's a copy of my vm config file: http://pastebin.com/bX0ayA0u
>>
>> I think reducing the guest RAM to 3.5GB is worth a shot, along with
>> only passing a single GPU device.
>
> If I recall, the RAM limit is specific to PV guests or older versions
> of Xen. I have run Windows with 4, 6, 8, and 16GB of RAM without ever
> encountering this problem, and this includes tests with the xl
> toolstack on Xen 4.1.2.

Including VGA passthrough on those guests?

The only single-GPU cards I have are the Radeon 5850s in the AMD box I
have, and I'm just a little reticent to tear that machine apart because
it gets used a lot.
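For what it's worth, dropping below the reported 4GB threshold is a
two-line change to the guest config. A minimal sketch - the BDF and
values here are made up, so substitute your own:

    # domU config - illustrative values only
    memory = 3584           # 3.5GB, under the alleged 4GB qemu issue
    pci = [ '0a:00.0' ]     # pass only the one GPU for the test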
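And to make the noirqbalance suggestion further up concrete, here is
roughly what a GRUB legacy boot entry would look like with the flag on
both the hypervisor and dom0 kernel lines. Treat it as a sketch - the
paths, versions and dom0_mem value are invented, so adjust for your
setup:

    title Xen 4.2 / Linux 3.8.10
    kernel /boot/xen.gz dom0_mem=2048M noirqbalance
    module /boot/vmlinuz-3.8.10 root=/dev/sda1 ro noirqbalance
    module /boot/initrd.img-3.8.10

You would also want the irqbalance daemon itself off, e.g.
"/etc/init.d/irqbalance stop" plus "chkconfig irqbalance off" (or
whatever your distro's equivalent is).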
>>> I think my next step is to look for a video card that properly
>>> supports FLR,
>>
>> As far as I can tell, for all the talk of it, there is NO SUCH THING.
>> Somebody on the list posted lspci -vvv from their ATI FirePro card
>> which shows it has no FLR, and I have just got a Quadro 2000, which
>> also lacks FLR. The only vague mention I have seen of FLR on a GPU is
>> of the Intel GPU built into the very latest generation of Core i
>> CPUs.
>
> And even if that is true, it's not all that useful for gaming. Heh.
> The crappiest GPU that would ever be in my system is the most
> compatible? Good grief. :P

I'm not sure about compatible, but it does seem to have a feature that
the others don't. Then again, take that with a pinch of salt - I don't
have one, and I tend not to believe such things until somebody shows me
the lspci dump that proves it.

> Where did you find mention of the newer integrated graphics supporting
> FLR? I have an IvyBridge 3770 with an HD4000, but when I ran lspci -vv
> and -vvv I did not see FLReset+ - though maybe I did something
> incorrectly, as I did not see any mention of FLReset anywhere. If the
> IvyBridge integrated graphics has FLReset I would totally want to test
> it. It may not be a powerful chip compared to modern discrete cards,
> and it won't prove that the lack of FLR is the cause of our AMD/nVidia
> problems, but it would show the effect the presence of FLR has.

I came across a post on a forum or a mailing list after googling
something like "GPU" "FLReset+" and then trawling through a few hundred
pages to find one that actually lists lspci output referring to a GPU.
Having said that, I have also found references to people claiming that
FirePro and Quadro cards have FLR, which is quite clearly not the case.
So let's not assume that it's true just because somebody on the internet
said so. :) (There is a sketch of how to check a card for FLR with
lspci further down.)

>>>>>> 2) My motherboard's PCIe slots are behind NF200 PCIe bridges
>>>>>> (yes, EVGA have decided in their infinite wisdom to put all 7
>>>>>> PCIe slots behind NF200s; none are directly attached to the
>>>>>> Intel NB).
>>>>>
>>>>> I'm so sorry :P. NF200 has probably caused a lot of Xen tinkerers
>>>>> to utter a few dozen cuss words apiece.
>>>>
>>>> I can believe that. What is the solution, though? The thing that
>>>> drives me really nuts about the issues I'm seeing (which may or may
>>>> not be specifically related to the NF200) is that they are so
>>>> intermittent. It works well enough to boot up and run a gaming-type
>>>> load for a few minutes. Then something happens that causes the VGA
>>>> card to require a reset, and it all falls apart.
>>>
>>> My solution was to buy another motherboard. I had no luck at all
>>> passing the devices behind the NF200, and similar to your situation,
>>> all but one PCIe slot on that board was behind that bridge.
>>
>> Did you not manage to get it working at all? Or was it just
>> intermittent, like in my case? I can typically get about 5 minutes of
>> gaming out of my ATI card before it all goes wrong. Ironically, I was
>> thinking about an Asus Sabertooth with an 8-core AMD, but opted to go
>> for broke and get a couple of 6-core Xeons and an EVGA SR-2. It turns
>> out a solution that is 4x more expensive isn't actually better... :(
>
> I was unable to get it working at all. The NF200 simply threw errors
> that 100% prevented me from passing the device. I think it was missing
> a number of specific features required for passthrough, and I vaguely
> remember running lspci -vvv to verify what was missing. Perhaps not
> all NF200s are created equal?

The only logged issue I had with the NF200s was the lack of ACS, which
can be disabled as I mentioned earlier in this thread (at least if you
are using the xm stack - see the sketch below). After I disabled that,
PCI passthrough has been working OK. It's just VGA passthrough BSOD-ing
after some minutes that is causing me problems.
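To save anyone digging for it: the ACS check lives on the xend side. If
memory serves, it's the strict check option in /etc/xen/xend-config.sxp,
something like the below, followed by a xend restart. Do check the
comments in your version's file, as I am quoting this from memory:

    # /etc/xen/xend-config.sxp
    (pci-passthrough-strict-check no)

xl does not read xend-config.sxp at all, which is why I keep qualifying
this with "if you are using the xm stack".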
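As for FLR, checking whether a given card actually advertises it is a
one-liner; look for FLReset+ (present) versus FLReset- (absent) in the
DevCap line. The BDF here is made up - take yours from "lspci | grep
VGA":

    lspci -vvv -s 01:00.0 | grep -i FLReset

Every GPU dump I have seen so far, FirePro and Quadro included, shows
FLReset-.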
>>> In reading up on the wiki, there does indeed seem to be a lot more
>>> info regarding the use of xl and PCI passthrough today than the last
>>> time I looked. It seems that these types of configuration options
>>> are set on a domain-by-domain basis, or even per device; the docs
>>> say that things like VPCI vs direct PASS mapping of slot layout(?)
>>> are actually configured at the device level, either in your DomU
>>> config file (like:
>>> pci = ['0:d:0.0, pci-just-forking-work-damn-____you]) or via xl
>>> (like: xl pci-attach 1 0:d:0.0 pci-just-forking-work-damn-____you).
>>
>> Hmm... I honestly don't think the xl way will succeed where xm is
>> unstable, but I might give it a shot.
>
> You'd still likely require all the "hacks" you're currently using, but
> they'll all move to different places, I'm guessing... if the toolstack
> itself doesn't have any bearing on this (which is my suspicion) then
> you don't want to go doing all the extra work for nothing, of course!

Exactly. And right now, from what I have read (somebody point me to
something that says otherwise), more people seem to have reported
success with the xm stack than with xl - but that could just be due to
the xl stack being much more recent. I would go as far as to say that
most of those reports came from people who used the packaged Xen, and
until very recently the packaged Xen was 4.0 or 4.1, where xm is still
the default toolstack. Which I don't find to be in any way an
encouragement to even attempt to do this using the xl toolstack at the
moment. :)

Gordan
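P.S. For completeness, since I may yet give xl a shot: the per-device
syntax being referred to above looks roughly like this. The BDF and
options are illustrative (msitranslate/permissive are the knobs I have
seen mentioned for passthrough quirks), not a recipe I have verified:

    # in the domU config:
    pci = [ '0d:00.0,msitranslate=1,permissive=1' ]

    # or hot-plugged at runtime:
    xl pci-attach <domain> 0d:00.0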
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users