
Re: [Xen-devel] Status of FLR in Xen 4.4



On Fri, 27 Sep 2013 14:26:31 +0200, Matthias <matthias.kannenberg@xxxxxxxxxxxxxx> wrote:
Hi Gordon,

I tried your patch on my dom0 kernel and I think it helped somewhat,
in the sense that I can now reboot the domUs without crashing the
whole host. However, the Linux domU still gets a black screen and the
Windows 7 domU only boots as far as a black screen with a (movable)
cursor, but no further. This might only be a coincidence, though; I
have to double check.

What patch? Nothing I posted to the list is fit for public
consumption yet. You shouldn't be using it unless you really,
REALLY know exactly what it does and know exactly what you
are trying to achieve.

I tried some other stuff, too:

1) after domU shutdown rebind both functions to the dom0 drivers, do a
sysfs reset and re-add to assignable devices -> crashes dom0

My experience shows that letting dom0 drivers ever touch the hardware
is a recipe for disaster.

2) after domU shutdown rebind both functions to the dom0 drivers and
re-add to assignable devices -> dom0 sometimes crashes when the domU
using the devices comes up, sometimes not, but no success either way
3) sysfs reset of the devices within domU seems to be passed through
dom0 (see commands in qemu-log) but has no effect

It's up to the drivers to do the sensible thing. Nvidia drivers
handle this a little more sanely, but if the drivers cannot handle
clobbering the device's state into a known state, you are pretty
much fighting a losing battle.

Also, I analysed your code and compared it to the stuff in the Python
tools of xm. It is the same approach and I don't see any obvious
differences.

I am starting to suspect you aren't actually talking about my code
but somebody else's...

Then I tried to replicate the secondary bus reset on the
command line for testing purposes via

printf '\x40' | dd of=/sys/devices/pci0000:00/0000:00:0b.0/config bs=1 \
    seek=$((0x3e)) count=1 conv=notrunc

but I think I got some endians or offset slightly wrong because after
that xl refuses to give the device (00:0b.0 is the bus of my
2-function vga card I have assigned to my domU) to the domU and later
crashes dom0.
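
For reference, here is a minimal bash sketch of a secondary bus reset done as a read-modify-write. It is not taken from any patch in this thread. It assumes the standard PCI-to-PCI bridge layout, where the 16-bit Bridge Control register sits at offset 0x3e and Secondary Bus Reset is bit 6 (mask 0x40) of its low byte. Because only that one low byte is touched, endianness does not come into it. The helper names (read_byte, write_byte, secondary_bus_reset) and the delay values are illustrative assumptions. Unlike a plain write of 0x40, this keeps the other control bits as they were and clears the reset bit again afterwards.

```shell
#!/usr/bin/env bash
# Sketch only: pulse the Secondary Bus Reset bit of a PCI-to-PCI bridge
# through its sysfs config file, preserving the other Bridge Control bits.
# Helper names and delay values are illustrative assumptions.

BRIDGE_CONTROL_OFFSET=$((0x3e))   # low byte of the Bridge Control register
SECONDARY_BUS_RESET=$((0x40))     # bit 6 of that byte

# Print the byte at offset $2 of file $1 as a decimal number.
read_byte() {
    od -An -tu1 -j "$2" -N 1 "$1" | tr -d ' \n'
}

# Write byte value $3 (decimal) at offset $2 of file $1 without truncating it.
write_byte() {
    printf "\\x$(printf '%02x' "$3")" |
        dd of="$1" bs=1 seek="$2" count=1 conv=notrunc status=none
}

# Assert reset, hold it, then deassert. $1 is the bridge's config file.
secondary_bus_reset() {
    local cfg=$1 old
    old=$(read_byte "$cfg" "$BRIDGE_CONTROL_OFFSET")
    write_byte "$cfg" "$BRIDGE_CONTROL_OFFSET" $((old | SECONDARY_BUS_RESET))
    sleep 0.1   # hold time: an assumed, generous value
    write_byte "$cfg" "$BRIDGE_CONTROL_OFFSET" $((old & ~SECONDARY_BUS_RESET))
    sleep 0.5   # settle time before touching the devices behind the bridge
}
```

With the bridge from the command above this would be called as root as `secondary_bus_reset /sys/devices/pci0000:00/0000:00:0b.0/config`. Writing to the config space of a live bridge can still take the host down, so this is only something to try on a test machine with the devices behind the bridge not in use.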

So I'm a little lost at that point and would welcome some suggestions.

Does FLR reset work for any of you with VGA cards?

If you are talking about VGA cards with _proper_ FLR implementations
at the PCI level - there is no such thing. In all cases it is down to
the domU driver to handle the card in whatever state it is in. This
works reasonably well with supported Nvidia cards (i.e.
Quadro [K][2456]000 and Grid K[12] and equivalent modified GeForce
cards (Fermi 4xx and Kepler 6xx/7xx series)). I never managed to
get it working properly on any other GPUs.

Even with Nvidia cards rebooting can lead to issues. For example,
I have two GPUs passed to two different domUs. One is a GTX470
modified to Q5000. The other is a GTX480 modified to Q6000. The
domU with Q5000 always handled reboots reasonably reliably. The
one with a Q6000 did not. I have since switched the one with a Q6000
to a QK5000 (modified GTX680), and now the reboots seem to work
reasonably reliably, but I have found that there is still a
crash if the monitor on the card changes between shutdown and
restart. I'm guessing the card remembers its state and if it
isn't consistent when it returns, the driver gets confused. I have
other issues (see recent thread about Nvidia passthrough from
David), but they seem to be specific to my setup.

It's not perfect, but it's the only workable solution I have
found.

Gordan



 

