
Re: [Xen-devel] Theorycraft about Radeons VGA Passthrough issues

On 09/17/2014 10:04 AM, Zir Blazer wrote:
I sent this e-mail two days ago to xen-users, but to be honest, I think
that developers should be interested in this as well.

I think I have some interesting info regarding the infamous Radeon
"performance glitch" issue.

I have a Radeon 5770, a Sapphire Flex Edition to be specific, should be
this one:

By default, its highest Power State runs the GPU @ 850 MHz and the
VRAM @ 1200 MHz, with a GPU Voltage of 1.125V, if I recall correctly.
Because I usually underclock/undervolt absolutely everything in the name
of power efficiency, I modified the Video Card BIOS with Radeon BIOS
Editor to use custom Power States. You can actually flash it from within
a VM - I could flash mine with the ATI WinFlash tools. However, you need
a computer reboot for the changes to take effect; restarting the VM
isn't enough. My current PowerPlay settings are these, along with a
modified Fan curve:


The VM where I use this Radeon runs WXP SP3 with the GPLPV Drivers. I
use Arch Linux as Dom0 with Xen 4.3.1 built with the ATI VGA
Passthrough patch that was included in the Arch Linux User Repository
xen package (This patch is also currently provided for Xen 4.4.1; it
builds properly with it but it is claimed not to have been tested).
Syslinux is the Boot Loader, and I have specified in its config
file to hide the Radeon PCI address, so Dom0 shouldn't see or initialize

On a "good" VM start, I can run GPU-Z and see my Video Card switch
between different Power States (the ones I configured previously)
depending on load. That's good.


Sometimes, like when I shut down the VM then open it again, the Radeon
gets stuck on a Power State which is NOT a value from the PowerPlay
table which I modified. In my Video Card, it is GPU @ 850 MHz with VRAM
@ 1200 MHz, which was the highest default Power State on the original
BIOS, but it is not present in mine anymore. Also, GPU-Z and other tools
fail to report GPU Voltage, which I suppose should be 1.125V.
Temperatures on load are also accordingly much higher than what is
achievable with my highest Power State, and so is Fan noise, so I
suppose that the Video Card is really running at those values, though I
didn't benchmark it (Should be faster for obvious reasons).


I suppose that when the Video Card fails to initialize properly, it
falls back on a "backup" Power State, which in my Video Card coincides
with the highest default one, and that Power State is not part of the
regular PowerPlay table. My theory is that when you have the
"performance glitches", it is because other BIOSes may instead have a
backup Power State which is closer to what you would expect of a power
saving mode, while in my case it is totally the opposite.

While some people claim that you need to do a full reboot of the
computer to restart the VM that uses the Radeon, I didn't have such
issues. It is as simple as restarting the VM one or two more times, and
as soon as GPU-Z shows 150/300 or 700/800 I know it is good to go. I
*DID* need a full computer reboot for VBIOS flashes to take effect, and
while experimenting with Frequency/Voltage settings, if the Voltage was
too low to be fully stable at that Frequency, I couldn't get the GPU to
work again without a computer reboot, with the VM always BSODing on
boot.

I recall having seen BIOSes in the TechPowerUp VGA BIOS Collection whose
PowerPlay tables had some weird values, like for example, Frequencies
appropriate for Idle with the highest Power State GPU Voltage, and more
interesting, vice versa, which should be a no-go. Because I don't know
what the "backup" settings are for when the Video Card doesn't
initialize properly, nor where they come from or whether they are
rational values, I suspect that a bad combination of those hidden
values could be heavily related to this and is why these people have
this issue.

Sometimes I have experienced crashes on the WXP VM that force me to
kill it from Dom0 with xl, and that leaves the Monitor attached to the
Radeon showing a frozen screen. On the next VM start it displays some
weird behavior: the Monitor gets refreshed with a white screen, after
some time it goes black, then some time later a BSOD on Dom0 VM's
screen follows. However, restarting the VM again, it works properly. So
for pretty much any issue not related to GPU stability, closing and
opening the VM a few times gets the Video Card initialized properly
sooner or later. In the last 4-5 months of usage since finishing my
PowerPlay table, which proved to be stable, there has been no time
where I had to actually reboot the computer to restart the VM in any
event (normal shut down then creating it again with xl, or killing the
VM due to a crash or whatever).

I'm interested in having people who also had issues with VGA
Passthrough and Radeons post GPU-Z screens (or write the values) when
they suffer from performance degradation after VM restarts. I'm
inclined to believe that it is entirely related to the vBIOS backup
Power State that the card uses when it fails to initialize properly.
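
For anyone collecting such readings, here is a rough sketch (not a
tested tool) of how a GPU-Z clock reading could be sorted into
"configured Power State", "suspected backup state" or "something else".
The function name, the tolerance and the table layout are made up for
illustration. The two table entries are only the 150/300 and 700/800
values mentioned above, and 850/1200 is the backup state suspected for
this particular card, so both would need replacing with values from
your own BIOS.

```python
# Hypothetical helper, for illustration only: compare a (GPU MHz, VRAM MHz)
# reading taken by hand from GPU-Z against the known Power States.

# Assumption: only the two "good" readings quoted in this mail. A real
# PowerPlay table may contain more states.
POWERPLAY_TABLE = {(150, 300), (700, 800)}

# Assumption: the highest default state of the original BIOS, which is
# the suspected backup state on this specific card.
SUSPECTED_BACKUP = (850, 1200)


def classify_reading(gpu_mhz, vram_mhz, tolerance_mhz=5):
    """Return 'configured', 'backup' or 'unknown' for one clock reading."""

    def close(a, b):
        # Both clocks must be within the tolerance to count as a match.
        return (abs(a[0] - b[0]) <= tolerance_mhz
                and abs(a[1] - b[1]) <= tolerance_mhz)

    reading = (gpu_mhz, vram_mhz)
    if any(close(reading, state) for state in POWERPLAY_TABLE):
        return "configured"
    if close(reading, SUSPECTED_BACKUP):
        return "backup"
    return "unknown"
```

A reading that comes back as "unknown" on another card would be the
interesting case, since it could point to a backup state that differs
from the one seen here.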

That might explain the performance degradation, but it doesn't explain the video corruption that also goes with it in most cases.

I wonder if the additional patch you mention you use does something to help alleviate the hung state of the card to allow it to re-initialize. I tried with Xen 4.3.0 and I could only ever get ATI cards to work properly on the first boot of the VM.


Xen-devel mailing list


